I wrote a script that uses the selenium and pyautogui modules to log in, scrape a value from an element, and print it, but it prints two dashes (--) instead of the value.

Here is the HTML that contains the value 417 which I want to retrieve:

<p id="totReqCountVal" class="trailer-0 avenir-regular font-size-4 text-green js-total-requests">417</p>

This is the relevant code I have tried:

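from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()  # assumed: the original snippet did not show how the driver was created
browser.get('website_to_be_scraped')
browser.find_element(By.ID, 'totReqCountVal')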

I then tried:

views = browser.find_element(By.ID, 'totReqCountVal')
print(views)

which returns:

(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")

With some help I have also tried the following:

Using CSS_SELECTOR and text attribute:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)

Using XPATH and get_attribute("innerHTML"):

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))

I also added the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

I have checked in devtools that the locator strategies identify the element uniquely, and I have also checked for iframes and a shadow root.

How do I retrieve the 417 value?

1 Answer

views is a WebElement, so printing it correctly shows the element's representation rather than its text:

(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")

Solution

To print the text 417, you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)
    
  • Using XPATH and get_attribute("innerHTML"):

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
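
If both locators still print -- (as reported in the comments), one possible cause, which this thread does not confirm, is that the page renders -- as a placeholder and a script fills in the count later. In that case the element is already visible when the wait returns, so waiting for visibility alone is not enough. A minimal sketch of waiting for the text itself to become numeric follows; text_is_numeric and read_total_requests are hypothetical helper names, and the digits-only pattern is an assumption about how the count is formatted:

```python
import re


def text_is_numeric(locator):
    """Build a wait condition that succeeds once the element's text is all digits.

    The returned callable takes the driver, as WebDriverWait.until() expects,
    and returns the stripped text when it is numeric, or False otherwise.
    """
    def _condition(driver):
        text = driver.find_element(*locator).text.strip()
        # A placeholder such as "--" does not match, so the wait keeps polling.
        return text if re.fullmatch(r"\d+", text) else False

    return _condition


def read_total_requests(driver, timeout=20):
    """Wait up to `timeout` seconds for a numeric value in #totReqCountVal."""
    # Imported here so the helper above stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(
        text_is_numeric((By.ID, "totReqCountVal"))
    )
```

Calling print(read_total_requests(driver)) would then print the value once it appears, or raise TimeoutException if the text never becomes numeric within the timeout. If the value is formatted with separators (for example 1,234), the pattern would need adjusting.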



11 Comments

Any reason why both Locator strategies print just two dashes, -- instead of 417?
Check through devtools whether the locator strategies identify the element uniquely.
I checked the devtools and copied the CSS selector, which is: #totReqCountVal. I also checked the XPath: //*[@id='totReqCountVal']. The only difference between this and the ones in the provided solution is the change from 'p' to '*'. Not sure how I would check if it's unique though.
Check the number of matches at the bottom right. It should be 1 of 1 matches.
Okay, I have 1 of 1 match for both //*[@id='totReqCountVal'] and #totReqCountVal when using CTRL-F to search within the devtools.