I'm trying to scrape text from a series of hyperlinks on a main page and then store the results as a list of string objects. The code I've written works when I perform it on an individual link, but it breaks down when I try to loop through all the links.

FYI, my base url looks like this:

base_url = "http://www.achpr.org"

And my hyperlinks look like this:

hyperlinks = ['/sessions/58th', 
'/sessions/58th/resolutions/337/', 
'/sessions/58th/resolutions/338/', 
'/sessions/58th/resolutions/339/', ...]

So this works fine:

import requests
from bs4 import BeautifulSoup

r = requests.get('http://www.achpr.org' + "/sessions/19th-eo/resolutions/328/")
soup = BeautifulSoup(r.text, "lxml")
soup.find('b').span.string
text = soup.findAll('span')

y = []
for i in text:
    x = i.strings  # yields the strings inside the tag (no markup)
    y.extend(x)

y = "".join(y)
y = y.replace("\n", " ")
y = y.replace("\xa0", " ")  # str.replace matches literally, not as a regex
print(y)

But when I try to turn this into a loop:

output = []

for link in hyperlinks:
    r = requests.get('http://www.achpr.org' + link)
    soup = BeautifulSoup(r.text, "lxml")
    soup.find('b').span.string
    text = soup.findAll('span')

    y = []
    for i in text:
        x = i.strings  # yields the strings inside the tag (no markup)
        y.extend(x)

    y = "".join(y)
    y = y.replace("\n", " ")
    y = y.replace("\xa0", " ")  # str.replace matches literally, not as a regex
    output.append(y)  # append keeps one string per page; extend would add single characters

I get the following error:

Error message

It feels like I'm making a really simple looping error (putting indents in the wrong place), but I've been staring at this too long and I'd like a fresh pair of eyes. Can anyone spot what I'm doing wrong?

1 Answer

I don't think this is an indentation error.

for link in hyperlinks:
    r = requests.get('http://www.achpr.org' + link)
    soup = BeautifulSoup(r.text, "lxml")
    b = soup.find('b')
    # Skip pages that have no <b> tag, or no <span> inside it
    if b is None or b.span is None:
        continue
    soup.find('b').span.string
    text = soup.findAll('span')

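Add an if test before soup.find('b').span.string, so that pages lacking the expected <b>/<span> markup are skipped instead of raising an AttributeError on None.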
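
For reference, here is a minimal sketch of the whole loop with that check in place. The names extract_span_text, scrape_all, and the fetch parameter are hypothetical helpers of mine, not part of requests or BeautifulSoup. I also default to the built-in "html.parser" so the sketch runs without lxml installed; pass parser="lxml" to match the code in the question.

```python
from bs4 import BeautifulSoup

BASE_URL = "http://www.achpr.org"


def extract_span_text(html, parser="html.parser"):
    """Return the joined text of every <span>, or None if the page
    lacks the expected <b><span> markup (hypothetical helper)."""
    soup = BeautifulSoup(html, parser)

    # The guard from the answer: bail out instead of calling .span on None
    bold = soup.find("b")
    if bold is None or bold.span is None:
        return None

    parts = []
    for span in soup.find_all("span"):
        # .strings yields the text nodes inside the tag, without markup
        parts.extend(span.strings)

    text = "".join(parts)
    # str.replace matches literal substrings, so "\xa0" is the bare
    # non-breaking space character
    return text.replace("\n", " ").replace("\xa0", " ")


def scrape_all(hyperlinks, fetch):
    """Collect one string per page; `fetch` is any callable that maps
    a URL to its HTML text (an assumption made to keep this testable)."""
    output = []
    for link in hyperlinks:
        text = extract_span_text(fetch(BASE_URL + link))
        if text is not None:
            output.append(text)  # append, so each page stays one string
    return output
```

With requests, this could be called as scrape_all(hyperlinks, lambda url: requests.get(url).text). Passing the fetcher in as an argument is only a convenience so the parsing logic can be tried on a literal HTML string without hitting the network.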
