Lately, I have been trying to do web scraping and crawling with Python and Selenium's chromedriver. The target is a reddit page that lists threads, each with a title. Clicking a title opens that particular thread, which consists of a description and content.
What I am trying to do:
- Step 1) Visit a reddit page
- Step 2) Scan all the titles, store them in an array
- Step 3) Loop through each of the items in the Title array
- Step 4) Click on the title
- Step 5) Get the description
- Step 6) Go back
- Step 7) If there are titles left, continue from Step 3; otherwise click "next", go to the next page, and start again from Step 1.
I have been able to get the titles and even get to the point where it clicks a title. But after the browser goes back, I get an error at this line in Step 3: data['title'].append(title.text). It happens once I have clicked a title and come back to the listing page, and the error message is: "Message: stale element reference: element is not attached to the page document"
I have not been able to debug this issue, as I am fairly new to Python. Any help will be appreciated.
Here is the code:
for i in range(0,3):
    titles = []
    titles = browser.find_elements_by_css_selector(".title.may-blank")
    for title in titles:
        i = i+1
        try:
            data['title'].append(title.text)
        except KeyError:
            data['title'] = [title.text]
        title.click()
        description = browser.find_element_by_css_selector(".usertext-body.may-blank-within.md-container")
        print description.text
        browser.execute_script("window.history.go(-1)")
    button = browser.find_element_by_class_name("next-button")
    button.click()
print data['title']
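For reference, the error usually means the elements in titles were found on a page that has since been reloaded, so those references no longer point at anything in the current document. One common way around it is to copy the plain data (text and link) out of each element before navigating anywhere, then visit each link directly instead of clicking and going back. Below is a minimal sketch of that idea, not a tested fix for this exact page. It assumes each title element is a link with an href attribute, and the helper names collect_links and scrape_page are hypothetical. The locator string "css selector" is the value of Selenium's By.CSS_SELECTOR.

```python
# Locator strategy string; identical to selenium's By.CSS_SELECTOR.
CSS = "css selector"


def collect_links(browser, selector=".title.may-blank"):
    """Copy (text, href) out of every title element while the page is loaded.

    Plain strings stay valid after navigation; element references do not.
    Assumes each matched element is a link carrying an href attribute.
    """
    links = []
    for element in browser.find_elements(CSS, selector):
        links.append((element.text, element.get_attribute("href")))
    return links


def scrape_page(browser,
                description_selector=".usertext-body.may-blank-within.md-container"):
    """Visit every thread linked from the current listing page.

    Returns a list of {"title": ..., "description": ...} dicts and leaves
    the browser back on the listing page.
    """
    listing_url = browser.current_url
    results = []
    for title, href in collect_links(browser):
        browser.get(href)  # open the thread directly, no click and no history.go(-1)
        description = browser.find_element(CSS, description_selector).text
        results.append({"title": title, "description": description})
    browser.get(listing_url)  # return to the listing before looking for "next"
    return results
```

With a real driver this would be called as scrape_page(browser) once per listing page, and the next-button element would be looked up only after it returns, so that it is found on the freshly loaded listing.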