How to scrape some links from a website using selenium

Question

I've been trying to parse the links ending with 20012019.csv from a webpage using the script below, but I always get a timeout exception. As far as I can tell, I did things the right way.

However, any insight as to where I'm going wrong will be highly appreciated.

My attempt so far:

from selenium import webdriver

url = 'https://promo.betfair.com/betfairsp/prices'

def get_info(driver,link):
    driver.get(link)
    for item in driver.find_elements_by_css_selector("a[href$='20012019.csv']"):
        print(item.get_attribute("href"))

if __name__ == '__main__':
    driver = webdriver.Chrome()
    try:
        get_info(driver,url)
    finally:
        driver.quit()

Selenium is overkill for this project. Have you considered using requests and BeautifulSoup?
– nicholishen, Commented Jan 20, 2019 at 18:05
The content is dynamic, so I highly doubt requests can handle it, @nicholishen.
– MITHU, Commented Jan 21, 2019 at 4:20
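
Whether the links are present in the raw HTML or only appear after scripts run is not settled in this thread. If they do turn out to be in the static markup, a browser-free approach along the lines of the first comment could look like the sketch below. It swaps requests and BeautifulSoup for the standard library (urllib and html.parser) so that it needs no extra installs. The names fetch_html, SuffixLinkParser and extract_links are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class SuffixLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose href ends with a given suffix."""

    def __init__(self, base_url, suffix):
        super().__init__()
        self.base_url = base_url
        self.suffix = suffix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.endswith(self.suffix):
            # Resolve relative hrefs against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url, suffix):
    """Return absolute URLs of matching links found in an HTML string."""
    parser = SuffixLinkParser(base_url, suffix)
    parser.feed(html)
    return parser.links


def fetch_html(url, timeout=180):
    """Download a page as text; the generous timeout is an assumption."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling extract_links(fetch_html(url), url, "20012019.csv") would then return the matching links, provided they exist in the downloaded markup. If the list comes back empty, the content is probably rendered by JavaScript and Selenium remains the better fit.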

SimonF · Accepted Answer · 2019-01-20 17:54:30Z

2

Your code is fine (I tried it and it works). You get a timeout because the default timeout is 60s according to this answer, and the page is huge.

Add this to your code before making the get request (to wait 180s before timeout):

driver.set_page_load_timeout(180)
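
For context, here is a minimal sketch of how that call might fit into the function from the question. The name collect_links and the page_load_timeout parameter are assumptions for illustration, and the function returns the links instead of printing them. The locator strategy is passed as the plain string "css selector", which is the value behind By.CSS_SELECTOR, so the helper itself needs no Selenium imports.

```python
def collect_links(driver, link, suffix="20012019.csv", page_load_timeout=180):
    """Load `link` and return the hrefs of anchors ending with `suffix`.

    `driver` is expected to be a Selenium WebDriver, e.g. webdriver.Chrome().
    """
    # Raise the page-load limit before requesting the (large) page.
    driver.set_page_load_timeout(page_load_timeout)
    driver.get(link)

    # Same selector as in the question: href attribute ends with the suffix.
    selector = "a[href$='{}']".format(suffix)
    elements = driver.find_elements("css selector", selector)
    return [element.get_attribute("href") for element in elements]
```

With a real browser this would be called as collect_links(webdriver.Chrome(), url), followed by driver.quit() in a finally block as in the question.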


undetected Selenium · Accepted Answer · 2019-01-21 01:40:27Z

0

You were close. You have to induce WebDriverWait for the visibility of all elements located, and you need to change the line:

for item in driver.find_elements_by_css_selector("a[href$='20012019.csv']"):

to:

for item in WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a[href$='20012019.csv']"))):

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
