Web scraping data using Python

Question

I have just started learning web scraping with Python. My aim is to scrape the real-time news for Bajaj Auto Ltd. from http://money.rediff.com/companies/Bajaj-Auto-Ltd/10540026.

The problem: I'm unable to extract the contents (i.e. the news headlines).

from urllib.request import urlopen
from bs4 import BeautifulSoup

url = 'http://money.rediff.com/companies/Bajaj-Auto-Ltd/10540026'
data = urlopen(url)
soup = BeautifulSoup(data)

te=soup.find('a',attrs={'target':'_jbpinter'})
lis=te.find_all_next('a',attrs={'target':'_jbpinter'})
#print(lis)

for li in lis:
    print(li.find('a').contents[0])

I'm getting the error "AttributeError: 'NoneType' object has no attribute 'contents'", and I do not get the desired result.

Any input will be appreciated.

It looks like it can't find what you think is there. Try printing li and see whether there is actually an a tag inside it.
– R Nar, Commented Nov 4, 2015 at 16:44
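
The check suggested in this comment can be sketched on a small inline document. The markup below is a hypothetical stand-in for the real page, and html.parser is assumed as the parser; only the target attribute value is taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; the real page's structure is not shown in the question.
SAMPLE_HTML = (
    '<p><a target="_jbpinter" href="/a">Headline A</a></p>'
    '<p><a target="_jbpinter" href="/b">Headline B</a></p>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
first = soup.find("a", attrs={"target": "_jbpinter"})
lis = first.find_all_next("a", attrs={"target": "_jbpinter"})

# Record what each item is and whether it lacks a nested <a>.
report = [(li.name, li.find("a") is None) for li in lis]
for name, has_no_inner_a in report:
    print(name, "has no nested <a>:", has_no_inner_a)
```

If every item reports that it is itself an a tag with no nested a, then li.find('a') returns None, and accessing .contents on it raises the AttributeError from the question.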

dstudeba · Accepted Answer · 2015-11-04 16:52:11Z

1

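You are trying to get the a tag twice. Each item in lis is already an a tag, so li.find('a') searches for another a nested inside it, finds none, and returns None.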

Replace

for li in lis:
    print(li.find('a').contents[0])

with

for li in lis:
    print(li.get_text())

and you get this output:

Need Different Rates For Different Products: Rahul Bajaj on GST
Reforms irrespective of Bihar results: Bajaj
Auto shares in focus; Tata Motors up over 5%
We believe new Avenger will stimulate the market: Bajaj Auto's Eric Vas
BHP Billiton pins future of Indonesian coal mine on new...
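
The fix can also be tried offline with a minimal sketch. The sample markup, the helper name extract_headlines, and the html.parser choice are assumptions made for illustration; the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the news list on the real page.
SAMPLE_HTML = """
<div>
  <a target="_jbpinter" href="/news/1">First headline</a>
  <a target="_jbpinter" href="/news/2">Second headline</a>
  <a target="_jbpinter" href="/news/3">Third headline</a>
</div>
"""

def extract_headlines(html):
    """Return the text of every <a target="_jbpinter"> link in html."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", attrs={"target": "_jbpinter"})
    # Each link is already the <a> tag, so read its text directly.
    return [link.get_text(strip=True) for link in links]

headlines = extract_headlines(SAMPLE_HTML)
for headline in headlines:
    print(headline)
```

This sketch uses soup.find_all rather than find followed by find_all_next, because find_all_next returns only the matches after the starting tag and so leaves out the first link. Whether that first link is wanted depends on the page.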
