Python BeautifulSoup Scrape Data writing to Excel "NotImplementedError"

Question

I'm trying to write a script that scrapes a website with Python and BeautifulSoup and then writes the data into an Excel sheet.

It works up until the writing section, where I get a NotImplementedError. I looked it up and wrapped the write section of the code in a try/except block with pass. That silenced the error in the Python interpreter console window, but my Excel sheet was blank.

Here is what I have so far:

import requests, openpyxl
from bs4 import BeautifulSoup

wb = openpyxl.Workbook('RDWM_CRM.xls')
wb.create_sheet('Phone')
sheet = wb.get_sheet_by_name('Phone')

# nav to webpage I want to scrape
url = "http://www.yellowpages.com/search?search_terms=roofing%20company&geo_location_terms=New%20York%2C%20NY&page=2"
r = requests.get(url)
soup = BeautifulSoup(r.content)

# for loop finds info then prints
for div in soup.find_all("div", {"class": "info"}):
    print (div.contents[0].text)
    print (div.contents[1].text)            

# for loop finds info then writes to excel cells
for div in soup.find_all("div", {"class": "info"}):
    sheet['A1'] = div.contents[0].text
    sheet['B1'] = div.contents[1].text

wb.save('RDWM_CRM.xls')

As I said above, even with no errors I was getting a blank Excel sheet. Here is the traceback as seen in the console:

Neptune Construction
Serving the New York Area.(866) 664-1759
>>> # for loop finds info then writes to excel cells
... for div in soup.find_all("div", {"class": "info"}):
...     sheet['A1'] = div.contents[0].text
...     sheet['B1'] = div.contents[1].text
...
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "C:\Users\Josh\AppData\Local\Programs\Python\Python35\lib\site-packages\openpyxl\writer\write_only.py", line 223, in removed_method
raise NotImplementedError
NotImplementedError
>>> wb.save('RDWM_CRM.xls')

This shows the last piece of scraped data, followed by the error.

Thanks for the help! I'm still getting a blank Excel sheet. Here is the code I'm using now. There are no errors, and it creates the new sheet named Phone, but the sheet is empty.

import requests
from bs4 import BeautifulSoup
from openpyxl import Workbook
url = "http://www.yellowpages.com/search?search_terms=roofing%20company&geo_location_terms=Seattle%2C%20WA&page=4" # nav to webpage I want to scrape
r = requests.get(url)
soup = BeautifulSoup(r.content)

# create a dummy list of texts to write to excel file
divs = []

wb = Workbook() # open new workbook, use load_workbook if existing
ws = wb.create_sheet('Phone')
for div in divs:
    row = [div.contents[0].text, div.contents[1].text]  # construct a row: shown only for example purposes
    ws.append(row)          # could use ws.append(div) since each div is a list 

wb.save('RDWM_CRM.xlsx')     # save workbook, will overwrite if exists

Any help is appreciated!!

Please include the traceback. Did the error happen in wb.save?
– memoselyk, Commented Dec 22, 2015 at 23:56
Traceback (most recent call last): File "<stdin>", line 3, in <module> File "C:\Users\Josh\AppData\Local\Programs\Python\Python35\lib\site-packages\openpyxl\writer\write_only.py", line 223, in removed_method raise NotImplementedError NotImplementedError >>> wb.save('RDWM_CRM.xls')
– user3429394, Commented Dec 23, 2015 at 0:06
@user3429394 Please edit your question and put the full text of the traceback there.
– MattDMo, Commented Dec 23, 2015 at 0:09

Steve Misuta · Accepted Answer · 2015-12-23 02:16:51Z

Apologies in advance if I don't completely understand your question, but there appear to be some issues with the use of openpyxl.
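One likely issue, judging from the write_only.py frame in the traceback, is the call Workbook('RDWM_CRM.xls'). As far as I can tell, the first positional parameter of Workbook is write_only, not a filename, so a non-empty string switches the workbook into write-only mode, where assigning to individual cells is not supported. The sketch below only illustrates that assumption. The exact exception type may differ between openpyxl versions, so it is caught broadly.

```python
from openpyxl import Workbook

# The string lands in the write_only parameter and is truthy,
# so this workbook ends up in write-only mode.
suspect = Workbook('RDWM_CRM.xls')
suspect_ws = suspect.create_sheet('Phone')
print("write-only:", bool(suspect.write_only))

# Cell assignment is expected to fail on a write-only sheet.
# The exception type depends on the openpyxl version, hence the broad except.
assignment_failed = False
try:
    suspect_ws['A1'] = 'Neptune Construction'
except Exception as exc:
    assignment_failed = True
    print("cell assignment failed with", type(exc).__name__)

# A workbook created without arguments accepts cell assignment.
normal = Workbook()
normal_ws = normal.create_sheet('Phone')
normal_ws['A1'] = 'Neptune Construction'
print("write-only:", bool(normal.write_only))
```

The filename only needs to appear in wb.save(...). Since openpyxl writes the xlsx format, a .xlsx extension is the safer choice there.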

Here is an example case of how to write worksheets using openpyxl that may be helpful:

from openpyxl import Workbook

# create a dummy list of texts to write to excel file
divs = [[chr(i)*8, chr(i+1)*8] for i in range(65, 75, 1)]

wb = Workbook()             # open new workbook, use load_workbook if existing
ws = wb.create_sheet(title="Example")
for div in divs:
    row = [div[0], div[1]]  # construct a row: shown only for example purposes
    ws.append(row)          # could use ws.append(div) since each div is a list 
wb.save('example.xlsx')     # save workbook, will overwrite if exists

The dummy list divs looks like this:

[['AAAAAAAA', 'BBBBBBBB'],
 ['BBBBBBBB', 'CCCCCCCC'],
 ['CCCCCCCC', 'DDDDDDDD'],
 ['DDDDDDDD', 'EEEEEEEE'],
 ['EEEEEEEE', 'FFFFFFFF'],
 ['FFFFFFFF', 'GGGGGGGG'],
 ['GGGGGGGG', 'HHHHHHHH'],
 ['HHHHHHHH', 'IIIIIIII'],
 ['IIIIIIII', 'JJJJJJJJ'],
 ['JJJJJJJJ', 'KKKKKKKK']]

And the Excel file 'example.xlsx' has this worksheet 'Example':

   A        B
1  AAAAAAAA BBBBBBBB
2  BBBBBBBB CCCCCCCC
3  CCCCCCCC DDDDDDDD
4  DDDDDDDD EEEEEEEE
5  EEEEEEEE FFFFFFFF
6  FFFFFFFF GGGGGGGG
7  GGGGGGGG HHHHHHHH
8  HHHHHHHH IIIIIIII
9  IIIIIIII JJJJJJJJ
10 JJJJJJJJ KKKKKKKK
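If the file looks blank when opened, one way to check what was actually written is to load it back with load_workbook and print the rows. This is only a sketch: the temporary directory is my own addition so the check does not overwrite an existing example.xlsx. Note that Workbook() also creates a default sheet titled "Sheet", which stays empty here, so the data ends up on the second tab.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Same dummy data as in the example above
divs = [[chr(i) * 8, chr(i + 1) * 8] for i in range(65, 75)]

wb = Workbook()
ws = wb.create_sheet(title="Example")
for div in divs:
    ws.append(div)  # each call writes to the next empty row

# Save into a temporary directory (an assumption, to avoid clobbering files)
path = os.path.join(tempfile.mkdtemp(), "example.xlsx")
wb.save(path)

# Read the file back and collect the cell values row by row
reloaded = load_workbook(path)
rows = [[cell.value for cell in row] for row in reloaded["Example"].iter_rows()]

print(reloaded.sheetnames)  # the default sheet plus "Example"
print(len(rows), "rows; first:", rows[0], "last:", rows[-1])
```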

You would construct a row something like this:

row = [div.contents[0].text, div.contents[1].text]

assuming that div.contents is correct. Hope this helps. P.S. I am using openpyxl version 2.3.0.
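Applied to the revised code in the question, the sheet is probably blank because divs is still an empty list, so the loop body never runs. The original loop also wrote every result to A1 and B1, so at best the last entry would survive. Here is a minimal sketch of the whole flow, assuming the page markup looks roughly like the inline HTML below. That markup and the second listing are made up for illustration, and the real page structure may differ. It parses a string instead of calling requests.get so that it runs offline.

```python
import os
import tempfile

from bs4 import BeautifulSoup
from openpyxl import Workbook

# Hypothetical stand-in for r.content; the real page layout is an assumption
html = (
    '<div class="info"><h2>Neptune Construction</h2><p>(866) 664-1759</p></div>'
    '<div class="info"><h2>Example Roofing</h2><p>(555) 010-0000</p></div>'
)

soup = BeautifulSoup(html, "html.parser")

# Collect the real divs instead of leaving the list empty
divs = soup.find_all("div", {"class": "info"})

wb = Workbook()
ws = wb.active          # reuse the default sheet so no empty tab is left over
ws.title = "Phone"

for div in divs:
    # One row per listing; append moves to the next empty row each time
    ws.append([div.contents[0].text, div.contents[1].text])

# Temporary directory is an assumption; use your own path in practice
path = os.path.join(tempfile.mkdtemp(), "RDWM_CRM.xlsx")
wb.save(path)
print(ws.max_row, "rows written to", path)
```

With the live page, divs would come from the soup built from r.content. If the print loop in the question shows output but the sheet stays empty, checking len(divs) just before the write loop is a quick sanity test.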

4 Comments

user3429394 Over a year ago

Thank you for your help!

user3429394 Over a year ago

I'm still running into the excel sheet being blank, here is my revised code:

Steve Misuta Over a year ago

Were you able to copy the code I posted, run it on your system and output the excel file example.xlsx?

user3429394 Over a year ago

no... this is what I get when I copy and paste your code:
