
I'm using urllib.request with Python 3.4.6 to open https://www.ethz.ch/ (the actual URL is longer, but the problem is the same). The page opens fine in Firefox, but Python raises a 404 error.

Here is the code:

from urllib.request import urlopen
connection = urlopen('https://www.ethz.ch/')

It gives the following error message:

Traceback (most recent call last):
  File "./generate_group_meetings_ical.py", line 9, in <module>
    connection = urlopen('https://www.ethz.ch/')
  File "/usr/lib64/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python3.4/urllib/request.py", line 470, in open
    response = meth(req, response)
  File "/usr/lib64/python3.4/urllib/request.py", line 580, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python3.4/urllib/request.py", line 508, in error
    return self._call_chain(*args)
  File "/usr/lib64/python3.4/urllib/request.py", line 442, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.4/urllib/request.py", line 588, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not found UA

The code used to work fine, though. One more piece of information: I'm not root on the machine, and python3 was recently upgraded from 3.4.5 to 3.4.6. So the problem comes either from the web server side or from the Python side. I'm neither a Python nor a web expert, so I couldn't figure it out myself.

I hope somebody can help me.

  • Sounds like a user agent problem, try setting the user agent string to something else to see if that's the problem. Commented Jul 5, 2017 at 9:40
  • Thanks a lot Francisco, it solved the problem. I posted an answer describing it. Commented Jul 5, 2017 at 12:56
  • Just had the same problem while building a scraper for exercise sheets at the same university lol Commented Sep 25, 2017 at 18:52
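To check the user agent theory from the first comment, it can help to look at what urllib identifies itself as when no header is set. The sketch below is only an illustration: `default_user_agent` is a hypothetical helper name, and it reads the default header list of a freshly built opener without making any network request. The trailing "UA" in the error message ("Not found UA") presumably refers to this value, although only the server's operators could confirm that.

```python
from urllib.request import build_opener


def default_user_agent():
    """Return the User-Agent value urllib sends when none is set explicitly.

    Hypothetical helper for illustration; it does not contact any server.
    """
    opener = build_opener()
    # addheaders is a list of (name, value) pairs added to every request
    # made through this opener.
    headers = dict(opener.addheaders)
    return headers.get('User-agent')


if __name__ == '__main__':
    # The value has the form "Python-urllib/<major>.<minor>".
    print(default_user_agent())
```

If the server filters on this string, sending any other value should change the outcome.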

2 Answers


Thanks to Francisco's comment and that post, I could make it work with the following code:

from urllib.request import Request, urlopen
req = Request('https://www.ethz.ch/', headers={'User-Agent': 'Mozilla/5.0'})
connection = urlopen(req)

I also checked the original version with Python 2.7.13 and urllib2, and it worked. Apparently Python 3.5 works too (see the answer from Laxmikant), and the code originally worked under 3.4.5. So something seems to have changed with the upgrade from 3.4.5 to 3.4.6 that triggers the error.
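For a script that runs unattended, the same fix can be wrapped in a small function that also reports the HTTP status if the server still refuses the request. This is a minimal sketch rather than the exact code I use: `build_request` and `fetch` are hypothetical names, and the 10-second timeout is an arbitrary assumption.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def build_request(url, user_agent='Mozilla/5.0'):
    """Create a Request that carries an explicit User-Agent header."""
    return Request(url, headers={'User-Agent': user_agent})


def fetch(url, user_agent='Mozilla/5.0', timeout=10):
    """Download url and return the body as bytes.

    The timeout value is an assumption made for this sketch.
    """
    req = build_request(url, user_agent)
    try:
        with urlopen(req, timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        # err.code is the HTTP status, err.reason the server's message
        # (for example "Not found UA" in the traceback above).
        print('HTTP error', err.code, err.reason)
        raise
```

It would be called as `html = fetch('https://www.ethz.ch/')`. Re-raising the exception keeps the original traceback available to the caller.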



@Pheidippides Check your entire URL for typos. It worked for me:

Python 3.5.2 (default, Nov 17 2016, 17:05:23) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.request import urlopen
>>> connection = urlopen('https://www.ethz.ch/')
>>> connection.read()

1 Comment

Thanks @Laxmikant. I ran exactly the commands you entered and still got the error, but at least I now know that it works with Python 3.5.
