I am gathering links from a website. I iterate over the pages that it has, and on each page I retrieve the links with:

links = driver.find_elements_by_xpath('//*[contains(@class, "m_rs_list_item_main")]/div[1]/div[1]/a')

Now... sometimes the website fails and does not show the links that it should. For instance, it normally says:

link1

link2

...

link N

page M

And suddenly there is a page, say M+1, that doesn't show any links at all. The code then gets stuck at the line above (links = ...), "looking for" the links. I keep a counter to see how many links each page has:

if numlinks_inrun == 0:
    print('nolinks')

However, the message 'nolinks' is never printed. When I press CTRL+C in the terminal to abort the program, I get this traceback:

links = driver.find_elements_by_xpath('//*[contains(@class, "m_rs_list_item_main")]/div[1]/div[1]/a')
  File "/home/vladimir/anaconda3/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 305, in find_elements_by_xpath
    return self.find_elements(by=By.XPATH, value=xpath)

This is why I know that the program gets stuck at this point. Does anyone know how to set a timeout so that Selenium does not search forever for those nonexistent links?

  • @VladimirVargas I get the message "This request was blocked by the security rules" when I try to access the website. Is there an alternative? Thanks. Commented Jun 2, 2017 at 14:27
  • It looks like a particular page is taking a long time to load. You can set a page load timeout. I don't think find element is the cause: there is no implicit wait by default, so if it finds no element it does not keep waiting unless you have set an implicit timeout. Commented Jun 2, 2017 at 22:36

1 Answer

This appears to be an issue with the element not loading in time for Selenium to locate it. Consider adding an explicit wait, which lets you set the maximum number of seconds Selenium will wait for the specified page element before giving up. That is also why you never see the "nolinks" output: the lookup never returns normally, so the check is never reached.

Context: https://selenium-python.readthedocs.io/waits.html#explicit-waits
