
I am familiar with the fact that I should set the HTTP_PROXY environment variable to the proxy address.

Generally urllib works fine; the problem is with urllib2.

>>> urllib2.urlopen("http://www.google.com").read()

returns

urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>

or

urllib2.URLError: <urlopen error [Errno 11004] getaddrinfo failed>

Extra info:

urllib.urlopen(....) works fine! It is just urllib2 that is playing tricks...
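One way to check which proxy settings Python actually picks up from the environment — shown here with Python 3's urllib.request, where urllib2's functionality now lives, and a placeholder proxy address:

```python
import os
import urllib.request  # in Python 3, urllib2's functionality moved here

# Placeholder proxy address, for illustration only
os.environ["http_proxy"] = "http://127.0.0.1:8080"

# getproxies() reports the proxy settings Python detected from the environment
print(urllib.request.getproxies())
```

If the dictionary printed here does not contain the proxy you expect, the handlers will not use it either.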

I tried @Fenikso's answer, but now I'm getting this error:

URLError: <urlopen error [Errno 10060] A connection attempt failed because the 
connected party did not properly respond after a period of time, or established
connection failed because connected host has failed to respond>      

Any ideas?

  • Can you post actual whole sample code which gives you the error? Commented Apr 11, 2011 at 11:12
  • @Fenikso: this urllib2.urlopen("http://www.google.com").read() Commented Apr 11, 2011 at 11:25
  • So you have the proxy server set in HTTP_PROXY environment variable? Are you sure that server accepts the connection? Commented Apr 11, 2011 at 11:28

5 Answers


You can do it even without the HTTP_PROXY environment variable. Try this sample:

import urllib2

proxy_support = urllib2.ProxyHandler({"http":"http://61.233.25.166:80"})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

html = urllib2.urlopen("http://www.google.com").read()
print html

In your case it really seems that the proxy server is refusing the connection.


Something more to try:

import urllib2

#proxy = "61.233.25.166:80"
proxy = "YOUR_PROXY_GOES_HERE"

proxies = {"http":"http://%s" % proxy}
url = "http://www.google.com/search?q=test"
headers={'User-agent' : 'Mozilla/5.0'}

proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)

req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html
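For reference, the same ProxyHandler approach under Python 3, where urllib2 became urllib.request (placeholder proxy address; the actual fetch is commented out so the sketch runs without a working proxy):

```python
import urllib.request

# Placeholder proxy address; replace with your real proxy
proxy_support = urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

# All urllib.request.urlopen() calls now route HTTP requests through the proxy:
# html = urllib.request.urlopen("http://www.google.com").read()
```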

Edit 2014: This seems to be a popular question / answer. However, today I would use the third-party requests module instead.

For one request just do:

import requests

r = requests.get("http://www.google.com", 
                 proxies={"http": "http://61.233.25.166:80"})
print(r.text)

For multiple requests, use a Session object so you do not have to add the proxies parameter to every request:

import requests

s = requests.Session()
s.proxies = {"http": "http://61.233.25.166:80"}

r = s.get("http://www.google.com")
print(r.text)

11 Comments

Thanks for the reply! :) Now I'm getting URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>... urllib works perfectly though.
@RadiantHex - Works fine on my system. Do you have any proxy you have to use for internet access?
@RadiantHex - What is also the type of proxy you use?
@Fenikso: I do have to use an http proxy for internet access, and it is the same I use for all my software to get internet access. It is the same proxy I have set within the HTTP_PROXY variable.
@RadiantHex - So was it the proxy refusing connection because of user-agent?

I recommend you just use the requests module.

It is much easier than the built-in HTTP clients: http://docs.python-requests.org/en/latest/index.html

Sample usage:

import requests

r = requests.get('http://www.thepage.com', proxies={"http": "http://myproxy:3129"})
thedata = r.content
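A comment below asks about timeouts: requests.get also accepts a timeout argument (in seconds). A minimal sketch with a placeholder proxy address:

```python
import requests

# Placeholder proxy; timeout limits how long (in seconds) to wait
# for the connection and for the server's response
try:
    r = requests.get("http://www.thepage.com",
                     proxies={"http": "http://myproxy:3129"},
                     timeout=5)
    print(r.content)
except requests.exceptions.RequestException as exc:
    # raised on connection failure or when the timeout expires
    print("request failed:", exc)
```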

3 Comments

How do you set the timeout?
Wonderful. This works with both https and http, whereas urllib only works with http for me with python3.
I thought this was working for me, but tried putting random proxy information, and data was still retrieved each time (as long as https was used)

Just wanted to mention that you may also have to set the https_proxy OS environment variable in case HTTPS URLs need to be accessed. In my case it was not obvious, and I spent hours discovering this.

My use case: Win 7, jython-standalone-2.5.3.jar, setuptools installation via ez_setup.py
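A minimal sketch of setting both variables from Python before any request is made (placeholder proxy address):

```python
import os

# Placeholder proxy address; HTTPS URLs consult https_proxy,
# so setting http_proxy alone is not enough for them
os.environ["http_proxy"] = "http://127.0.0.1:8080"
os.environ["https_proxy"] = "http://127.0.0.1:8080"
```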



Python 3:

import urllib.request

url = "http://www.google.com"
htmlsource = urllib.request.FancyURLopener({"http": "http://127.0.0.1:8080"}).open(url).read().decode("utf-8")

1 Comment

from the TraceBack: DeprecationWarning: FancyURLopener style of invoking requests is deprecated. Use newer urlopen functions/methods.

I encountered this on a Jython client. The server was only talking TLS, while the client was using an SSL context:

javax.net.ssl.SSLContext.getInstance("SSL")

Once the client was switched to TLS, things started working.

