Using Python and Selenium to scrape the innerText within an HTML element?

Question

I wrote a script that uses the selenium and pyautogui modules to log in, scrape a value from an element, and print it, but it prints two dashes (--) instead of the value.

Here is the HTML that contains the value 417 which I want to retrieve:

<p id="totReqCountVal" class="trailer-0 avenir-regular font-size-4 text-green js-total-requests">417</p>

This is the relevant code I have tried:

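from selenium import webdriver
from selenium.webdriver.common.by import By

# Assumption: the driver setup was omitted from the post; any WebDriver would do.
browser = webdriver.Chrome()
browser.get('website_to_be_scraped')
browser.find_element(By.ID, 'totReqCountVal')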

I then tried:

views = browser.find_element(By.ID, 'totReqCountVal')
print(views)

which returns:

(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")

With some help I have also tried the following:

Using CSS_SELECTOR and text attribute:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)

Using XPATH and get_attribute("innerHTML"):

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))

I also added the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

I have checked in devtools that the locator strategies identify the element uniquely, and I have also checked for iframes and a shadow root.

How do I retrieve the 417 value?

undetected Selenium · Accepted Answer · 2022-02-10 19:30:07Z


views is a WebElement, so printing it correctly shows the element's representation rather than its text:

(session="12e48df447f7df855a1ee596ba609a30", element="1027ec31-8cb8-4758-b4b0-82b85628ed6c")

Solution

To print the text 417, you need to induce WebDriverWait for visibility_of_element_located(). You can use either of the following locator strategies:

Using CSS_SELECTOR and text attribute:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p#totReqCountVal[class$='js-total-requests']"))).text)

Using XPATH and get_attribute("innerHTML"):

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//p[@id='totReqCountVal' and contains(@class, 'js-total-requests')]"))).get_attribute("innerHTML"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python.
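If both snippets still print --, one possibility (an assumption; nothing in this thread confirms it) is that the page first renders -- as a placeholder and fills in the number later via JavaScript. visibility_of_element_located() returns as soon as the element is visible, so it could hand back that placeholder. WebDriverWait.until() accepts any callable that takes the driver, so a custom condition can wait for the text itself to change. A minimal sketch; the helper names text_is_populated and read_total_requests and the placeholder default are hypothetical:

```python
def text_is_populated(locator, placeholder="--"):
    """Build a wait condition that returns the element's text once it is
    non-empty and differs from `placeholder`; otherwise it returns False."""
    def condition(driver):
        text = driver.find_element(*locator).text.strip()
        # A falsy return value makes WebDriverWait.until() poll again.
        return text if text and text != placeholder else False
    return condition


def read_total_requests(driver, timeout=20):
    """Wait up to `timeout` seconds for the counter to hold a real value."""
    # Imported here so the condition above can be exercised without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.ID, "totReqCountVal")
    return WebDriverWait(driver, timeout).until(text_is_populated(locator))
```

Calling print(read_total_requests(driver)) would then either print the populated value or raise TimeoutException, which would at least show that the number never arrives in this element within the timeout.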

References

Link to useful documentation:

get_attribute() method: gets the given attribute or property of the element.
text attribute: returns the text of the element.
Difference between text and innerHTML using Selenium
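To see how these differ on one element, it can help to print them side by side. The sketch below is illustrative only; describe_text is a hypothetical helper, and textContent is an extra DOM property not mentioned above. The text attribute returns only the rendered, visible text, while innerHTML and textContent are read from the DOM regardless of visibility.

```python
def describe_text(element):
    """Return the different text views Selenium exposes for one WebElement."""
    return {
        # Rendered text as a user would see it; empty if the element is hidden.
        "text": element.text,
        # Markup inside the element, including any child tags.
        "innerHTML": element.get_attribute("innerHTML"),
        # Text of the element and its descendants, whether visible or not.
        "textContent": element.get_attribute("textContent"),
    }
```

For example, print(describe_text(driver.find_element(By.ID, "totReqCountVal"))) shows all three values at once, which makes it easier to tell whether -- is really what the DOM holds at that moment.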

edited Feb 10, 2022 at 19:30

answered Feb 10, 2022 at 19:23


11 Comments

Anthony Stokes Over a year ago

Any reason why both Locator strategies print just two dashes, -- instead of 417?

undetected Selenium Over a year ago

Check through devtools whether the locator strategies identify the element uniquely.

Anthony Stokes Over a year ago

I checked devtools and copied the CSS selector, which is #totReqCountVal. I also checked the XPath: //*[@id='totReqCountVal']. The only difference between this and the ones in the provided solution is the change from 'p' to '*'. I'm not sure how I would check whether it's unique, though.

undetected Selenium Over a year ago

Check the number of matches at the bottom right. It should be 1 of 1 matches.

Anthony Stokes Over a year ago

Okay, I have 1 of 1 match for both //*[@id='totReqCountVal'] and #totReqCountVal when using Ctrl+F to search within devtools.
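The same uniqueness check can also be done from the script rather than in devtools, since find_elements() returns a list of every match. A rough sketch, with count_matches as a hypothetical helper; it says nothing about elements inside iframes or shadow roots, which find_elements() on the top-level document does not reach:

```python
def count_matches(driver, locators):
    """Map each (by, value) locator to the number of elements it matches."""
    return {locator: len(driver.find_elements(*locator)) for locator in locators}


# Locators taken from the comments above; "css selector" and "xpath" are the
# string values behind By.CSS_SELECTOR and By.XPATH.
LOCATORS = [
    ("css selector", "#totReqCountVal"),
    ("xpath", "//*[@id='totReqCountVal']"),
]
```

Running print(count_matches(driver, LOCATORS)) after the page has loaded should report 1 for each locator if the element is unique in the current document.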
