
I am trying to get the value of an element that renders text upon clicking a dropdown. I am currently using implicitly_wait() to make sure the element appears, but when I run the script, the .text calls return empty strings. If I step through the script line by line, the .text values populate. Based on this I assume I have to wait for the text to render, but I can't work out how to do this.

Looking at the expected conditions documentation, all of the text_to_be_present_... conditions require me to know what text I am waiting for. Since I am web scraping, I don't know this, so I am trying to pass a regex to the text_ argument that matches a generic form of the value I am looking for. I am not getting the expected result: the value still comes back as an empty string when I run the script.

Here is the code I am trying:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

#Set the options for running selenium as headless
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1200")

#Create the driver object
driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)
driver.implicitly_wait(10)

output = []
driver.get(html)
nat_res_element = driver.find_element_by_xpath('//*[@id="accordion-theme"]/div[1]/div[1]/span')
nat_res_element.click()
element = WebDriverWait(driver, 10).until(EC.text_to_be_present_in_element_value((By.XPATH, '//*[@id="collapse0"]/div/div/ul/li/span[2]'), text_='[\d].*'))
output.append(element.text)

The url is: https://projects.worldbank.org/en/projects-operations/project-detail/P159382. I am trying to access the values under the 'Environment and Natural Resource Management' dropdown. Since the values take the form of digits followed by %, I am trying the regex [\d].*.

I would welcome a way to handle this.
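For reference, the kind of thing I imagine I need is a custom wait condition that polls until .text is non-empty (a sketch only, reusing the XPath from my script; WebDriverWait accepts any callable):

```python
def non_empty_text(locator):
    """Custom wait condition: return the element once its .text is
    non-empty, otherwise False so WebDriverWait keeps polling."""
    def _predicate(driver):
        element = driver.find_element(*locator)
        return element if element.text.strip() else False
    return _predicate

# Usage sketch (after clicking the dropdown):
# from selenium.webdriver.common.by import By
# from selenium.webdriver.support.ui import WebDriverWait
# element = WebDriverWait(driver, 10).until(
#     non_empty_text((By.XPATH, '//*[@id="collapse0"]/div/div/ul/li/span[2]'))
# )
# output.append(element.text)
```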

  • All of the below solutions work. For my purposes I simply called element.click() and then put driver.page_source into BS, and re-ran my old code, as per @Celius Stingher's suggestion. That said, I think @F.Hoque's answer is technically the best answer to my question about waiting on an expected condition and then calling .text on the element. Nonetheless, @undetected Selenium's answer works as well, calling get_attribute("innerHTML"). Thanks very much for all the help. Commented Jul 30, 2022 at 2:14
  • @undetectedSelenium's answer not only speaks about get_attribute("innerHTML") but also demonstrates 4 options involving CSS/text and XPath/get_attribute(), along with an explanation of why text_to_be_present_in_element_value() doesn't suit your use case. Commented Jul 30, 2022 at 7:21

3 Answers

climate_change = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '(//*[@class="twolevel"]//li//span)[2]'))).text
adaptation = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '(//*[@class="twolevel"]//li//span)[4]'))).text
mitigation = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '(//*[@class="twolevel"]//li//span)[6]'))).text

The above XPath expressions will pull the desired data from the 'Environment and Natural Resource Management' dropdown.

It also works fine with a non-headless browser.

Full Script:

import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("--window-size=1920,1200")
#options.add_argument("--headless")


s = Service("./chromedriver") ## path to where you saved chromedriver binary
driver = webdriver.Chrome(service=s, options=options)

url = 'https://projects.worldbank.org/en/projects-operations/project-detail/P159382'
driver.get(url)
time.sleep(5)

nat_res_element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="accordion-theme"]/div[1]/div[1]/span')))
nat_res_element.click()
data=[]
climate_change = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '(//*[@class="twolevel"]//li//span)[2]'))).text
adaptation = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '(//*[@class="twolevel"]//li//span)[4]'))).text
mitigation = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, '(//*[@class="twolevel"]//li//span)[6]'))).text
data.append({
    'Climate change':climate_change,
    'Adaptation':adaptation,
    'Mitigation':mitigation
    })

print(data)

driver.quit()
  

Output:

[{'Climate change': '64%', 'Adaptation': '32%', 'Mitigation': '32%'}]
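As an aside (not part of the original answer), the three near-identical waits read the even-numbered spans ([2], [4], [6]), so the XPath index can be generated by a small helper and the waits collapsed into a loop:

```python
def even_span_xpath(i):
    """XPath for the i-th (1-based) value span in the answer above:
    the percentage values sit at positions [2], [4], [6], ..."""
    return f'(//*[@class="twolevel"]//li//span)[{2 * i}]'

# Sketch of the wait loop (assumes driver/WebDriverWait/EC/By as in the script):
# data = {
#     label: WebDriverWait(driver, 10).until(
#         EC.visibility_of_element_located((By.XPATH, even_span_xpath(i + 1)))
#     ).text
#     for i, label in enumerate(['Climate change', 'Adaptation', 'Mitigation'])
# }
```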


I usually like to combine Selenium with BeautifulSoup. Thanks for sharing all the details; this would be my approach:

from bs4 import BeautifulSoup
import pandas as pd

driver.get("https://projects.worldbank.org/en/projects-operations/project-detail/P159382")

raw_source = driver.page_source
parsed = BeautifulSoup(raw_source, "html.parser")

variables = [x.text for x in parsed.find_all(class_='table-accordion-wrapper ta-block ng-star-inserted')[0].find_all(class_='proj-theme')]
values = [x.text for x in parsed.find_all(class_='table-accordion-wrapper ta-block ng-star-inserted')[0].find_all(class_='proj-theme-percentage')]

df = pd.DataFrame({'variables':variables,'values':values})


print(df)

Returns:

        variables values
0  Climate change    64%
1      Adaptation    32%
2      Mitigation    32%

The first find_all() accesses the Theme table, which contains 4 expandable tables. Given we only want the first one, I am indexing with [0] after the first find_all() (but if you'd like the values from the other tables you can use a nested list comprehension). The second find_all() iterates over the rows in the subtable, accessing Climate, Adaptation and Mitigation.
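If the other tables are wanted too, the nested list comprehension mentioned above could look like this (a sketch only, reusing the answer's class names):

```python
from bs4 import BeautifulSoup  # assumes bs4 is available, as in the answer

def extract_all_themes(parsed):
    """Collect (theme, percentage) pairs from every expandable table,
    not only the first one at index [0]."""
    tables = parsed.find_all(class_='table-accordion-wrapper ta-block ng-star-inserted')
    return [
        (theme.text, pct.text)
        for table in tables
        for theme, pct in zip(table.find_all(class_='proj-theme'),
                              table.find_all(class_='proj-theme-percentage'))
    ]

# e.g. extract_all_themes(BeautifulSoup(driver.page_source, "html.parser"))
```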

You can of course manipulate it further to generate a format you'd like, such as:

df = df.set_index('variables').T

Returning:

variables Climate change Adaptation Mitigation
values               64%        32%        32%

3 Comments

Thanks for this. My initial approach combined Selenium and BS, but I found that some pages would lose the data hidden in the dropdown when turned into a BS object (see this case for example: projects.worldbank.org/en/projects-operations/project-detail/…). It was based on this that I switched to trying to find and access the elements using Selenium methods. On this front, I should apologize for a typo in my original question: I left out the code where I called click on the dropdown element. Fixed now.
You should be able to click to expand with Selenium and then use page_source to get the expanded data.
Great. Even though this doesn't answer my question, this is probably easiest for me given I already have the code written to parse the BS objects.

text_to_be_present_in_element_value()

text_to_be_present_in_element_value() is the expectation for checking if the given text is present in the element’s value and is defined as:

def text_to_be_present_in_element_value(locator, text_):
    """
    An expectation for checking if the given text is present in the element's value.
    locator, text
    """

    def _predicate(driver):
        try:
            element_text = driver.find_element(*locator).get_attribute("value")
            return text_ in element_text
        except StaleElementReferenceException:
            return False

    return _predicate

This use case

You need to consider a couple of things here as follows:

  • The Expected Condition text_to_be_present_in_element_value() checks if the given text is present in the element's value attribute, not in the text / innerText, which is 64%.
  • Expected Conditions don't support regex, so the supplied regex [\d].* will be treated as a literal string.
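If a regex really is needed, the built-in conditions won't help, but WebDriverWait accepts any callable; a custom condition (a sketch, not part of the original answer) could apply re.search to the element's visible text:

```python
import re

def text_matches(locator, pattern):
    """Custom wait condition: return the element once its visible text
    matches the regex, otherwise False so WebDriverWait keeps polling."""
    regex = re.compile(pattern)
    def _predicate(driver):
        element = driver.find_element(*locator)
        return element if regex.search(element.text or "") else False
    return _predicate

# Usage sketch:
# WebDriverWait(driver, 10).until(
#     text_matches((By.XPATH, '//*[@id="collapse0"]/div/div/ul/li/span[2]'), r'\d+%')
# )
```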

Solution

To extract the text 64%, ideally you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div#collapse0 ul.twolevel li.firstlevel span.proj-theme +span"))).text)
    
  • Using XPATH and get_attribute("innerHTML"):

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[.='Climate change']//following::span[1]"))).get_attribute("innerHTML"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

1 Comment

This works. Thanks for the help. I chose another answer, as I thought it was closer to my specific question. Nonetheless, thanks for the help on the specific use case and solution.
