Selenium Python - Webscraping Xpath Error

Question

I am trying to scrape the country names from the following page: http://hdr.undp.org/en/composite/trends

I am trying to get the XPath of each country's element.

For the first country, it looks like this:

Country = driver.find_element_by_xpath("//[@id='styleSheet.css']/div/div/div/div/table/tbody/tr[2]/td[2]").text

For all the countries, I am using a for loop with the range function in Python:

for i in range(2,193):
    try:
        print(i)
        Country = driver.find_element_by_xpath("//[@id='styleSheet.css']/div/div/div/div/table/tbody/tr["+int(i)+"]/td[11]").text
        print(Country)
    except Exception:
        print("none")

But the XPath doesn't work for me. Kindly help me locate the right element.

I resolved the first problem by changing int to str, as that was the error being thrown. After that, it says it cannot locate the element.

I don't think you can just concatenate a string to an int like that. You can use the '{}'.format() method.
– SuperStew, Commented Dec 5, 2017 at 15:14
Does the exception provide any info? I.e., you can use except Exception as e and then print(e).
– ContinuousLoad, Commented Dec 5, 2017 at 15:15
@SuperStew No, the first place where I take the country is itself going wrong, even outside the loop.
– Student of the Digital World, Commented Dec 5, 2017 at 15:16
A stack trace would be much more helpful to others in debugging the problem.
– Manmohan_singh, Commented Dec 5, 2017 at 15:16
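
The two suggestions in the comments above can be sketched roughly as follows. This is only an illustration: cell_xpath is a hypothetical helper, and the "//table/tbody/..." path is a placeholder, not the real path for the page in the question.

```python
def cell_xpath(row, column):
    """Build an XPath for one table cell (hypothetical helper, placeholder path)."""
    # str.format converts the integers to text, so no manual str() is needed
    return "//table/tbody/tr[{}]/td[{}]".format(row, column)


i = 2
try:
    # Concatenating a str and an int raises TypeError
    broken = "//table/tbody/tr[" + i + "]/td[2]"
except TypeError as e:
    # Printing the exception shows the actual cause instead of hiding it
    print("TypeError:", e)

print(cell_xpath(i, 2))
```

Catching the exception as e and printing it, rather than printing a fixed "none", makes it much easier to tell a string-building mistake apart from an element that Selenium could not locate.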

alecxe · Accepted Answer · 2017-12-05 15:19:00Z

2

You don't have to use XPath for every Selenium element location problem. There are better ways to locate the countries in this case. You could go through every tr element inside the tbody of the table and get the second td element, which contains the country name:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Chrome()
driver.get("http://hdr.undp.org/en/composite/trends")

table = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".pane-content table")))
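# find_element(s)(By.CSS_SELECTOR, ...) is used here because the
# find_element(s)_by_css_selector helpers were removed in Selenium 4.3
for row in table.find_elements(By.CSS_SELECTOR, "tbody > tr")[1:]:  # skipping the first header row
    country = row.find_element(By.CSS_SELECTOR, "td:nth-child(2)")
    print(country.text)

driver.quit()  # end the session and close the browser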

Prints:

Norway
Australia
Switzerland
...
San Marino
Somalia
Tuvalu


2 Comments

Student of the Digital World Over a year ago

Why is it failing to pick up the values for the years? Say, for the year 2012, I am trying to get it like this: val = row.find_element_by_css_selector("td:nth-child(10)"), but it is failing.

alecxe Over a year ago

@Sid29 right, because not every row has 10 or more columns; you have to skip the rows which don't. You can do that with a try/except or by checking how many cells there are in a row. Thanks.
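
A minimal sketch of the "check how many cells are in a row" option might look like this. The helper name nth_cell_text is made up for illustration, and the rows below are stand-in objects rather than real Selenium elements; with Selenium, the cells list would come from something like row.find_elements(By.CSS_SELECTOR, "td").

```python
from types import SimpleNamespace


def nth_cell_text(cells, n):
    """Return the text of the n-th cell (1-based), or None if the row is too short."""
    if len(cells) < n:
        return None  # this row has fewer than n columns, so skip it
    return cells[n - 1].text


# Stand-in rows for demonstration only: each "cell" just needs a .text attribute,
# which is what a Selenium WebElement also exposes.
full_row = [SimpleNamespace(text=str(i)) for i in range(1, 13)]
short_row = [SimpleNamespace(text="group header")]

print(nth_cell_text(full_row, 10))   # 10
print(nth_cell_text(short_row, 10))  # None
```

Checking the length first avoids raising and catching an exception for every short row; the try/except route works as well if catching the lookup failure reads more naturally in the loop.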
