I'm trying to iterate through the pages of a site using the Selenium library, but I can only get the home page.

I rewrote my code to try to fix this, but now I only get the following message: InvalidSessionIdException: Message: invalid session id.

The code is below:

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome(executable_path=r'C:\MYPATH\chromedriver.exe')
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors-spki-list')
options.add_argument('--ignore-ssl-errors')

title_list = []
date_list  = []
genre_list = []

for page_num in range(1, 11):
    url = r"https://www.albumoftheyear.org/list/1500-rolling-stones-500-greatest-albums-of-all-time-2020/{}".format(page_num)
    driver.get(url)

    try:
        element = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.ID, "centerContent"))
        )
        albumlistrow = element.find_elements_by_class_name('albumListRow')
        for a in albumlistrow:
            title = a.find_element_by_class_name('albumListTitle')
            date = a.find_element_by_class_name('albumListDate')
            try:
                genre = a.find_element_by_class_name('albumListGenre')
            except NoSuchElementException:
                pass
            title_list.append(title.text)
            date_list.append(date.text)
            genre_list.append(genre.text)

    finally:
        driver.close()

df = pd.DataFrame(list(zip(title_list,date_list,genre_list)), columns=['title', 'data','genre'])
df.head()
  • Could you please correctly format the code so it can be properly assessed? Some iterations/loops are incorrectly indented. Commented Sep 10, 2021 at 3:58

1 Answer

I'm not sure about your error, but since you said you have a problem iterating over pages, I tried to replicate that and found that the site uses Cloudflare's bot protection system.

options.add_argument("--disable-blink-features=AutomationControlled")

Adding this argument seems to fix the problem.

Tested with the code below:

options = Options()
options.add_argument("--disable-blink-features=AutomationControlled")
# ChromeDriverManager is what I use on my local machine; you can pass your own executable_path instead
d = webdriver.Chrome(ChromeDriverManager().install(),options=options)
d.implicitly_wait(5)

for page_num in range(1,11):
    url = r"https://www.albumoftheyear.org/list/1500-rolling-stones-500-greatest-albums-of-all-time-2020/{}".format(page_num)
    d.get(url)
    sleep(3)

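Imports

from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager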

4 Comments

Unfortunately, the error still occurs. The following messages appear during execution: DevTools listening on ws://127.0.0.1:53530/devtools/browser/28978fe2-d8a9-46ee-bc85-0aa28c426868 [7572:11412:0910/093029.798:ERROR:ssl_client_socket_impl.cc(981)] handshake failed; returned -1, SSL error code 1, net_error -107 [13240:12308:0910/093032.997:ERROR:device_event_log_impl.cc(214)] [09:30:32.997] USB: usb_device_handle_win.cc:1048 Failed to read descriptor from node connection: A device connected to the system is not working. (0x1F)
@PSCM The code I posted iterates through the pages fine. If you only added my suggested .add_argument line to your code, try creating the webdriver instance after creating the options, as I did in my code. That may be what causes your issue: you declare your Chrome options but never pass them as an argument. So driver = webdriver.Chrome(executable_path=r'C:\MYPATH\chromedriver.exe') should come after the part where you create options, and you should pass them in as an argument: driver = webdriver.Chrome(executable_path=r'C:\MYPATH\chromedriver.exe', options=options)
@PSCM I'm not sure whether what I suggested above is mandatory, though. I've never tested it; I always pass the options as an argument when creating the webdriver.
Thanks! I fixed the issue with your solution.
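
The ordering described in the comments (create the options first, then pass them to the driver) could be sketched roughly as follows. The helper names page_urls, make_driver and visit_all are hypothetical and do not come from the original posts. The sketch assumes chromedriver can be found without an explicit path. Quitting the browser once after the loop, rather than after every page, is an assumption of this sketch and not something the answer states.

```python
BASE_URL = (
    "https://www.albumoftheyear.org/list/"
    "1500-rolling-stones-500-greatest-albums-of-all-time-2020/{}"
)


def page_urls(first=1, last=10):
    """Return the list URLs for pages first..last (inclusive)."""
    return [BASE_URL.format(n) for n in range(first, last + 1)]


def make_driver():
    """Build the options first, then hand them to the driver."""
    # Imported here so page_urls() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--disable-blink-features=AutomationControlled")
    # Assumption: chromedriver is discoverable (e.g. on PATH).
    return webdriver.Chrome(options=options)


def visit_all(driver, urls):
    """Load every URL with one driver and quit it once at the end."""
    visited = []
    try:
        for url in urls:
            driver.get(url)
            visited.append(driver.current_url)
    finally:
        # Quit after the loop so the session stays valid for every page.
        driver.quit()
    return visited
```

A call such as visit_all(make_driver(), page_urls()) would then walk pages 1 to 10 with a single browser session; the row-scraping logic from the question would go inside the loop, after driver.get(url).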
