I am using Python 3 and Selenium to grab some image links from a website as below:

import sys
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.proxy import Proxy, ProxyType

chrome_options = Options()  
chrome_options.add_argument("--headless")

driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')

link_xpath = '/html/body/main/div/div[2]/div[2]/div/div/div[2]/div/div[2]/div[1]/div/div/div[2]/div/img'

link_path = driver.find_element_by_xpath(link_xpath).text
print(link_path)

driver.quit()

When you open this URL, the image in question is in the middle of the page. If you right-click it in Google Chrome and choose Inspect, you can then right-click the element in Chrome DevTools and copy the XPath for this image.

All looks in order to me. However, when running the above code I get the following error:

Traceback (most recent call last):
  File "G:\folder\folder\testfilepy", line 16, in <module>
    link_path = driver.find_element_by_xpath(link_xpath).text
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/main/div/div[2]/div[2]/div/div/div[2]/div/div[2]/div[1]/div/div/div[2]/div/img"}
  (Session info: headless chrome=83.0.4103.61)

Can anyone tell me why Selenium is unable to find the xpath provided?

2
  • Try this: link_xpath = '//div[@class="c-bezel programme-content__image"]//img'. However, the element has no text to return. What do you want to achieve, and which attribute do you need? Commented Jun 5, 2020 at 11:00
  • Hi, when inspecting the element I see an HTTP link to the image: images.metadata.sky.com/pd-image/… Basically, I want to grab that link. Commented Jun 5, 2020 at 11:02

4 Answers

1

To extract the src attribute of the image, wait for the element with WebDriverWait and visibility_of_element_located(). You can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    options = webdriver.ChromeOptions() 
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument('--headless')
    options.add_argument('--window-size=1920,1080')
    driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.o-layout__item div.c-bezel.programme-content__image>img"))).get_attribute("src"))
    
  • Using XPATH:

    options = webdriver.ChromeOptions() 
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument('--headless')
    options.add_argument('--window-size=1920,1080')
    driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')     
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='o-layout__item']//div[@class='c-bezel programme-content__image']/img"))).get_attribute("src"))
    
  • Console Output:

    https://images.metadata.sky.com/pd-image/251eeec2-acb3-4733-891b-60f10f2cc28c/16-9/640
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
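
For anyone wondering what the explicit wait adds over a plain find_element call, here is a rough, standard-library-only sketch of the polling idea. It is not Selenium's actual implementation: wait_until, WaitTimeout, find_image_src and the placeholder URL are all made up for illustration. Selenium's WebDriverWait polls every 0.5 seconds by default and also ignores NoSuchElementException while it polls.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


# Simulated page: the image only "appears" on the third look-up,
# much like an element that is rendered by JavaScript after page load.
attempts = {"count": 0}


def find_image_src():
    attempts["count"] += 1
    if attempts["count"] < 3:
        return None  # element not rendered yet
    return "https://example.invalid/image.jpg"  # placeholder URL


src = wait_until(find_image_src, timeout=2.0, poll=0.01)
print(src)
```

A single immediate look-up would have returned nothing here, which is the same situation as asking the driver for an element before the page has finished rendering it.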
    


1

You have the correct XPath, but don't use absolute paths: they break as soon as the page layout changes. Try this relative XPath instead: //div[@class="c-bezel programme-content__image"]//img.

And to get the image link you are after, use .get_attribute("src"), not .text:

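from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# driver is the webdriver.Chrome instance created in the question
driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')
# Wait up to 20 seconds for the image to become visible, then read its src
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '//div[@class="c-bezel programme-content__image"]//img')))
print(element.get_attribute("src"))
driver.quit()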

Or better, use a CSS selector, which should be faster:

element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.c-bezel.programme-content__image > img')))

Reference: https://selenium-python.readthedocs.io/locating-elements.html
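
To illustrate both points without a browser, here is a small sketch using Python's built-in xml.etree.ElementTree. The markup is simplified and made up (only the class names and the image URL come from this thread), the "redesign" is hypothetical, and ElementTree supports only a subset of XPath, so treat it as a demonstration of the idea rather than of what Selenium does internally.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up markup; only the class names and image URL come from the thread.
PAGE = """
<html><body><main>
  <div class="o-layout__item">
    <div class="c-bezel programme-content__image">
      <img src="https://images.metadata.sky.com/pd-image/251eeec2-acb3-4733-891b-60f10f2cc28c/16-9/640" />
    </div>
  </div>
</main></body></html>
"""

# Same content after a hypothetical redesign wraps everything in one more <div>.
REDESIGNED = PAGE.replace("<main>", "<main><div>").replace("</main>", "</div></main>")

# Position-based path versus a path keyed on the class attribute.
ABSOLUTE = "./body/main/div/div/img"
RELATIVE = ".//div[@class='c-bezel programme-content__image']//img"

results = {}
for name, markup in (("original", PAGE), ("redesigned", REDESIGNED)):
    root = ET.fromstring(markup.strip())
    found_absolute = root.find(ABSOLUTE) is not None
    found_relative = root.find(RELATIVE) is not None
    results[name] = (found_absolute, found_relative)
    print(name, "absolute:", found_absolute, "relative:", found_relative)

img = ET.fromstring(PAGE.strip()).find(RELATIVE)
print(repr(img.text))  # an <img> element has no text content
print(img.get("src"))  # the link lives in the src attribute
```

The position-based path stops matching once the extra wrapper is added, while the class-based path still finds the image. The last two lines show why .text gives nothing useful for an img element.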

2 Comments

Hi, I did not know you could write XPaths in this manner, but I will make sure to do it this way in the future. Thanks.
@gdogg371 You're welcome. Just for reference, see locating-elements-by-xpath and locating-elements-by-css-selectors.
0

If you are working in headless mode, it is usually a good idea to set a window size. Add this line to your options:

chrome_options.add_argument('window-size=1920x1080')

8 Comments

Why? What does this option do?
Actually, I can see that this no longer throws an error, although I do not know why. However, it now appears to return a blank string, as no text is returned at all.
You would see no text because you are pointing to an img tag. You can manually check the DOM to confirm the text is "".
driver.find_element_by_xpath(link_xpath).get_attribute('src') does that.
Honestly, it's something I have learned from experience. Some websites need a faked window size to render their DOM properly in headless mode. Unfortunately, it's neither obvious nor clearly documented.
0

Your XPath seems to be correct. You weren't able to locate the element because you didn't handle the cookie prompt. Try it yourself: pause the driver for a few seconds and click to agree to all cookies, and then you will see your element. There are multiple ways to handle the cookie prompt. I was able to locate the element using my own, cleaner XPath, which reaches the element from its nearest parent.

Hope this helps.
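
One possible way to handle the prompt in code is to click the consent button only if it is present. The sketch below is an assumption-heavy illustration: click_if_present, FakeButton, FakeDriver and the selector "button.accept-all" are all hypothetical, and the fake classes exist only so the snippet runs without a browser. A real webdriver.Chrome instance exposes the same find_elements(by, value) method.

```python
def click_if_present(driver, css_selector):
    """Click the first element matching css_selector, if any; return whether a click happened."""
    # "css selector" is the string value behind Selenium's By.CSS_SELECTOR.
    matches = driver.find_elements("css selector", css_selector)
    if not matches:
        return False
    matches[0].click()
    return True


# Stand-ins so the sketch runs without a browser.
class FakeButton:
    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


class FakeDriver:
    def __init__(self, elements):
        self._elements = elements

    def find_elements(self, by, value):
        # Like Selenium, return an empty list when nothing matches.
        return list(self._elements.get((by, value), []))


# Hypothetical selector; inspect the real consent dialog to find the right one.
CONSENT_BUTTON = "button.accept-all"

button = FakeButton()
driver = FakeDriver({("css selector", CONSENT_BUTTON): [button]})

print(click_if_present(driver, CONSENT_BUTTON))    # button exists, gets clicked
print(click_if_present(driver, "button.missing"))  # nothing to click
```

Using find_elements rather than find_element means a missing banner yields an empty list instead of a NoSuchElementException. If the banner is rendered inside an iframe, the driver would first need to switch into it with driver.switch_to.frame(...) before the button can be found.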
