
I am familiar with the fact that I should set the HTTP_PROXY environment variable to the proxy address.

Generally urllib works fine; the problem is with urllib2.

>>> urllib2.urlopen("http://www.google.com").read()

returns

urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>

or

urllib2.URLError: <urlopen error [Errno 11004] getaddrinfo failed>

Extra info:

urllib.urlopen(....) works fine! It is just urllib2 that is playing tricks...
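One way to check which proxy settings Python actually picks up from the environment — shown here with Python 3's urllib.request, where urllib2's functionality now lives, and a placeholder proxy address:

```python
import os
import urllib.request  # in Python 3, urllib2's functionality moved here

# Placeholder proxy address, for illustration only
os.environ["http_proxy"] = "http://127.0.0.1:8080"

# getproxies() reports the proxy settings Python detected from the environment
print(urllib.request.getproxies())
```

If the dictionary printed here does not contain the proxy you expect, the handlers will not use it either.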

I tried @Fenikso's answer, but now I'm getting this error:

URLError: <urlopen error [Errno 10060] A connection attempt failed because the 
connected party did not properly respond after a period of time, or established
connection failed because connected host has failed to respond>      

Any ideas?

  • Can you post actual whole sample code which gives you the error? Commented Apr 11, 2011 at 11:12
  • @Fenikso: this urllib2.urlopen("http://www.google.com").read() Commented Apr 11, 2011 at 11:25
  • So you have the proxy server set in HTTP_PROXY environment variable? Are you sure that server accepts the connection? Commented Apr 11, 2011 at 11:28

5 Answers


You can do it even without the HTTP_PROXY environment variable. Try this sample:

import urllib2

proxy_support = urllib2.ProxyHandler({"http":"http://61.233.25.166:80"})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

html = urllib2.urlopen("http://www.google.com").read()
print html

In your case it really seems that the proxy server is refusing the connection.


Something more to try:

import urllib2

#proxy = "61.233.25.166:80"
proxy = "YOUR_PROXY_GOES_HERE"

proxies = {"http":"http://%s" % proxy}
url = "http://www.google.com/search?q=test"
headers={'User-agent' : 'Mozilla/5.0'}

proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)

req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html
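For reference, the same ProxyHandler approach under Python 3, where urllib2 became urllib.request (placeholder proxy address; the actual fetch is commented out so the sketch runs without a working proxy):

```python
import urllib.request

# Placeholder proxy address; replace with your real proxy
proxy_support = urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

# All urllib.request.urlopen() calls now route HTTP requests through the proxy:
# html = urllib.request.urlopen("http://www.google.com").read()
```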

Edit 2014: This seems to be a popular question / answer. However, today I would use the third-party requests module instead.

For one request just do:

import requests

r = requests.get("http://www.google.com", 
                 proxies={"http": "http://61.233.25.166:80"})
print(r.text)

For multiple requests, use a Session object so you do not have to add the proxies parameter to every request:

import requests

s = requests.Session()
s.proxies = {"http": "http://61.233.25.166:80"}

r = s.get("http://www.google.com")
print(r.text)

11 Comments

Thanks for the reply! :) Now I'm getting URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>... urllib works perfectly though.
@RadiantHex - Works fine on my system. Do you have any proxy you have to use for internet access?
@RadiantHex - What is also the type of proxy you use?
@Fenikso: I do have to use an http proxy for internet access, and it is the same I use for all my software to get internet access. It is the same proxy I have set within the HTTP_PROXY variable.
@RadiantHex - So was it the proxy refusing connection because of user-agent?

I recommend you just use the requests module.

It is much easier than the built-in HTTP clients: http://docs.python-requests.org/en/latest/index.html

Sample usage:

import requests

r = requests.get('http://www.thepage.com', proxies={"http": "http://myproxy:3129"})
thedata = r.content
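A comment below asks about timeouts: requests.get also accepts a timeout argument (in seconds). A minimal sketch with a placeholder proxy address:

```python
import requests

# Placeholder proxy; timeout limits how long (in seconds) to wait
# for the connection and for the server's response
try:
    r = requests.get("http://www.thepage.com",
                     proxies={"http": "http://myproxy:3129"},
                     timeout=5)
    print(r.content)
except requests.exceptions.RequestException as exc:
    # raised on connection failure or when the timeout expires
    print("request failed:", exc)
```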

3 Comments

How do you set the timeout?
Wonderful. This works with both https and http, whereas urllib only works with http for me with python3.
I thought this was working for me, but tried putting random proxy information, and data was still retrieved each time (as long as https was used)

Just wanted to mention that you may also have to set the https_proxy OS environment variable in case HTTPS URLs need to be accessed. In my case it was not obvious, and I spent hours discovering this.

My use case: Win 7, jython-standalone-2.5.3.jar, setuptools installation via ez_setup.py
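A minimal sketch of setting both variables from Python before any request is made (placeholder proxy address):

```python
import os

# Placeholder proxy address; HTTPS URLs consult https_proxy,
# so setting http_proxy alone is not enough for them
os.environ["http_proxy"] = "http://127.0.0.1:8080"
os.environ["https_proxy"] = "http://127.0.0.1:8080"
```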



Python 3:

import urllib.request

url = "http://www.google.com"
htmlsource = urllib.request.FancyURLopener({"http": "http://127.0.0.1:8080"}).open(url).read().decode("utf-8")

1 Comment

from the TraceBack: DeprecationWarning: FancyURLopener style of invoking requests is deprecated. Use newer urlopen functions/methods.

I encountered this on a Jython client. The server was only talking TLS, while the client was using an SSL context:

javax.net.ssl.SSLContext.getInstance("SSL")

Once the client was switched to TLS, things started working.

