1

I'm trying to scrape movie ratings from Metacritic. Here's the part of the code which is throwing an error.

text = text.replace("_","-")
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
headers={'User-Agent':user_agent,} 
URL = "http://metacritic.com/" + text
request=urllib.request.Request(URL,None,headers)
try:
    response = urllib.request.urlopen(request)
    data = response.read()
    soup = BeautifulSoup(data,'html.parser')
    metacritic_rating = "Metascore: " + soup.find("span",class_="metascore_w").get_text()
    send_message(metacritic_rating,chat) 
except:
    pass

I modified what I had written using this: https://stackoverflow.com/a/42441391/8618880

I cannot use requests.get() because of this: urllib2.HTTPError: HTTP Error 403: Forbidden

I'm looking for a way to get the status code of the page. I was able to find out a way when I used requests.get().

I checked out all the answers with the title: urllib.error.HTTPError: HTTP Error 404: Not Found Python but could not find any help.

Any help is appreciated.

3
  • so what exactly are you wanting? You just want to print out the response code? Commented Mar 29, 2019 at 9:22
  • Yes. I want to compare it against 404 and do something. Commented Mar 29, 2019 at 9:28
  • If you catch the HTTPError exception you can get the HTTP sttatus with HTTPError.code. Commented Mar 29, 2019 at 9:53

1 Answer 1

1

I think this is what you want:

import urllib


user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
headers={'User-Agent':user_agent,} 
URL = "http://metacritic.com/" + text
request=urllib.request.Request(URL,None,headers)

try:
    response = urllib.request.urlopen(request)
    data = response.read()
    soup = BeautifulSoup(data,'html.parser')
    metacritic_rating = "Metascore: " + soup.find("span",class_="metascore_w").get_text()
    send_message(metacritic_rating,chat) 
except urllib.error.HTTPError as err:
    #print(err.code)
    if err.code == 403:
        <do something>
    else:
        pass

Output:

403
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.