Python Selenium using For Loop to access element

Question

Basically, I want all the info from the Others tab, from the first page to the last. The website is a bit strange. I want to get every issuer and the other info under 'POST ISSUANCE'.

This is what I tried:

driver.get('https://www.chinabondconnect.com/en/Primary/Primary-Information/Onshore.html')
wait = WebDriverWait(driver, 30)
driver.find_element_by_link_text('Others').click()
for i in range(1,20):
        pg = "tb2tr pg" + str(i)
        allitems = driver.find_element_by_xpath('//*[@id="td7"]/tbody/tr[@class=pg])')
        for i in range(len(allitems)):
            issuer = driver.find_element_by_xpath('(//tr[@class=pg]//td[1]//div[2]//div)').text
            print(issuer)

It says the XPath is not valid.

Could someone help with this?

Thanks!!

hi, better with selenium

Commented Mar 2, 2021 at 10:40 — Joyce

KunduK · Accepted Answer · 2021-03-02 14:39:21Z

1

Use find_elements() to get all the records and use get_attribute("textContent") to get the hidden node value.

for item in driver.find_elements_by_xpath("//table[@id='tb7']//tr[starts-with(@class,'tb2tr pg')]//td[1]/div[2]/div"):
    print(item.get_attribute("textContent"))

Output:

Central Huijin Investment Ltd.
Dongguan Rural Commercial Bank Co., Ltd.
Gemdale (Group) Co., Ltd.
Everbright Securities
China securities co ltd
Bank of China 
Jinan Rail Transit Group Co., Ltd.
Ping An Bank Co., Ltd.
Shaanxi Financial Holding Group Co., Ltd.
Bank of Suzhou Co., Ltd.
Chongqing Expressway Group Co., Ltd.
Shanghai World Expo Land Holdings Co., Ltd.
Beijing Capital Tourism Group Co., Ltd.
CMB Financial Leasing Co., Ltd.
Shaanxi Coal Industry Chemical Group Co., Ltd.
China Securities Co., Ltd.
Guangdong Electric Power Development Co., Ltd.
China Construction Bank 
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China Limited
China Securities Co., Ltd.
China Securities Co., Ltd.
China Bohai Bank
Shangrao Investment Holding Group SCP
China Securities Co., Ltd
Everbright Securities
Guangzhou Kaide Renewable Publicly Issued Corporate Bond
SCP/Guangzhou Development Zone Business Development Group
Qingdao City Investment Financial Holding Group Renewable Publicly Issued Corporate Bond
China Railway Construction Investment Group MTN
Qingdao Guoxin Development (Group) Co., Ltd.
China Securities Co., Ltd.
China Orient Asset Management Co., Ltd
    Datang International Power Generation Co.,Ltd.
Bank of China
Bank of China 
Datang International Power Generation Co.,Ltd. 
Hangzhou City Construction Investment Group Limited
YIBIN STATE OWNED ASSETS MANAGEMENT CO.,LTD.
China Railway Construction Investment Corporation
ABC Financial Leasing
Guangzhou Metro
Aluminum Corporation of China Limited
Fubon Bank
China Securities Co., Ltd.
Ganzhou Development Investment Holding Group
Shanghai rural Commercial Bank
Everbright Securities
ICBC Financial Leasing Co., Ltd
Shanghai Pudong Development Bank
China State Railway Group Co., Ltd.
China State Railway Group Co., Ltd.
CMB Financial Leasing
CMB Financial Leasing Co., Ltd.
Bank of China
Bank of China 
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China Limited
Industrial and Commercial Bank of China Limited
Bank of Communications Co.,Ltd.
Zhejiang State-owned Capital Operation Co., Ltd.
China Merchant Bank
China Merchants Bank
Bank of Communications Financial Leasing Co., Ltd.
CCB Financial Leasing Co., Ltd
Central Huijin Investment Ltd.
Central Huijin Investment Ltd.
China Securities Co., Ltd
Everbright Securities
Beijing Infrastructure Investment Co., LTD
Huishang Bank Corporation
Bank of Communication
China Nonferrous Metal Mining (Group) Co., Ltd
Everbright Securities
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China Limited
China Securities Co., Ltd
China Everbright Bank Co., Ltd
Bank of China...so on
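
The find_elements_by_xpath helper used above was deprecated in Selenium 4 and removed in later 4.x releases, so on a current install the same lookup goes through find_elements with a locator strategy. The sketch below is one way to write it, assuming the table id and class prefix from the answer still match the page. The function name issuer_names is made up for illustration, and the strategy is passed as the plain string "xpath", which is the value of Selenium's By.XPATH constant. It also strips the padding that textContent keeps, which shows up in a few lines of the output above.

```python
# XPath from the answer: the issuer cell of every row whose class
# starts with "tb2tr pg", on any page of the table.
ISSUER_XPATH = (
    "//table[@id='tb7']//tr[starts-with(@class,'tb2tr pg')]"
    "//td[1]/div[2]/div"
)


def issuer_names(driver):
    """Return the issuer text of every matching row, hidden rows included."""
    names = []
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH.
    for cell in driver.find_elements("xpath", ISSUER_XPATH):
        # textContent is read from the DOM, so it is available for rows
        # that are not displayed; it can be None or padded with spaces.
        raw = cell.get_attribute("textContent") or ""
        names.append(raw.strip())
    return names
```

Calling issuer_names(driver) after clicking the Others link should give the same names as the loop in the answer, without the leading or trailing spaces.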

answered Mar 2, 2021 at 14:39

KunduK


3 Comments

Joyce Over a year ago

Hello, it works fine, thanks for your help!

Joyce Over a year ago

May I ask why you use textContent rather than .text?

KunduK Over a year ago

@Joyce: .text only returns a value when the element is visible on the page. On this site you need to scroll the page to make the elements visible, which is why it gives you an empty value. textContent returns the text of every node present in the DOM, including hidden ones. I hope that answers your question. Please mark this as accepted and vote for it. Thanks.
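
To check this on a particular element, a small diagnostic can put the three values side by side. This is only a sketch: describe_element is a hypothetical helper name, while is_displayed(), .text and get_attribute("textContent") are the WebElement members discussed in this thread.

```python
def describe_element(element):
    """Collect what Selenium reports for one element, for comparison."""
    return {
        # False for elements hidden by CSS, such as rows of an inactive page.
        "displayed": element.is_displayed(),
        # Rendered text only; empty when the element is not displayed.
        "text": element.text,
        # Raw DOM text; available whether or not the element is displayed.
        "textContent": element.get_attribute("textContent"),
    }
```

For a hidden row one would expect displayed to be False and text to be empty, while textContent still carries the issuer name.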

Arundeep Chohan · Accepted Answer · 2021-03-02 18:23:55Z

1

"//table[@id='tb7']/tbody//tr[starts-with(@class,'{}')]".format(pg)

Try using this XPath for allitems. It grabs every tr in table tb7 whose class starts with the "tb2tr pg" + str(i) value.

You could just use

for item in allitems:
    issuer = item.find_element_by_xpath('./td[1]/div[2]/div').get_attribute('textContent')
    print(issuer)
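
A note on the starts-with test: an exact @class='tb2tr pg1' comparison fails as soon as a row carries an extra class, and a comment below reports seeing 'tb2tr pg1 on' on this page. The prefix test has its own catch, though: 'tb2tr pg1' is also a prefix of 'tb2tr pg10' through 'tb2tr pg19', so page 1 can pick up rows from later pages. The sketch below shows one possible way around both, by matching whole class tokens with the usual XPath 1.0 concat/normalize-space idiom. The helper name rows_xpath and its table_id parameter are made up for illustration.

```python
def rows_xpath(page, table_id="tb7"):
    """Build an XPath for the rows of one page, matching whole class tokens."""
    token = "pg{}".format(page)
    # Padding the class list with spaces lets " pg1 " match "tb2tr pg1"
    # and "tb2tr pg1 on", but not "tb2tr pg10".
    return (
        "//table[@id='{table}']/tbody//tr["
        "contains(concat(' ', normalize-space(@class), ' '), ' tb2tr ') and "
        "contains(concat(' ', normalize-space(@class), ' '), ' {token} ')]"
    ).format(table=table_id, token=token)
```

Inside the loop, rows_xpath(i) could then take the place of the formatted string above when looking up allitems.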

edited Mar 2, 2021 at 18:23

answered Mar 2, 2021 at 11:21

Arundeep Chohan


5 Comments

Joyce Over a year ago

Hi, thanks for your help! But when I use

for i in range(1,20):
    pg = "tb2tr pg" + str(i)
    allitems = driver.find_element_by_xpath("//table[@id='td7']/tbody//tr[starts-with(@class,'{}')]".format(pg))

it says it is unable to locate the element. Not sure why.

Joyce Over a year ago

I think there is an extra 'on' in the class, as in 'tb2tr pg1 on'. I changed the loop to for i in range(2,20), but it still cannot find the elements.

Joyce Over a year ago

Thanks! But it returns blank. Does that mean the website cannot be scraped? Also, may I ask why you use tr[starts-with(@class,'{}')] rather than tr[@class='{}']?

Joyce Over a year ago

Hey, I tried with an example. There is nothing wrong with the website, but somehow I cannot get any text back. Everything comes out blank with

for i in range(2,20):
    pg = "tb2tr pg" + str(i)
    driver.implicitly_wait(10)
    allitems = driver.find_elements_by_xpath("//table[@id='tb7']/tbody//tr[starts-with(@class,'{}')]".format(pg))
    for item in allitems:
        issuer = item.find_element_by_xpath('./td[2]/div[2]/span').text
        print(issuer)

Arundeep Chohan Over a year ago

Use get_attribute('textContent') instead of text

Geomario · Accepted Answer · 2021-03-02 10:22:25Z

0

Correct me if I am wrong, but I understand that you want to crawl the entire site, which means that each click opens a new page. The Selenium WebDriver does not switch to a new window or tab on its own; it stays focused on the original one. You have to tell it to switch explicitly. One way to do this is:

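from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC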

# Start the driver
with webdriver.Firefox() as driver:
    # Open URL
    driver.get("https://seleniumhq.github.io")

    # Setup wait for later
    wait = WebDriverWait(driver, 10)

    # Store the ID of the original window
    original_window = driver.current_window_handle

    # Check we don't have other windows open already
    assert len(driver.window_handles) == 1

    # Click the link which opens in a new window
    driver.find_element(By.LINK_TEXT, "new window").click()

    # Wait for the new window or tab
    wait.until(EC.number_of_windows_to_be(2))

    # Loop through until we find a new window handle
    for window_handle in driver.window_handles:
        if window_handle != original_window:
            driver.switch_to.window(window_handle)
            break

    # Wait for the new tab to finish loading content

edited Mar 2, 2021 at 10:22

answered Mar 2, 2021 at 10:13

Geomario


3 Comments

Joyce Over a year ago

Hi, thanks, but the website already has everything on a single page, so I do not need to click through to the next page.

Geomario Over a year ago

I see. Have you tried printing the element?

Geomario Over a year ago

element = driver.find_element(By.TAG_NAME, "a")

Obed Campos Hernández · Accepted Answer · 2021-12-20 16:14:02Z

0

# Waits for the page to load when it needs a lot of resources and reloads in your window
import time
from selenium.webdriver.common.by import By

# XPath of an element that is only present once the page has loaded
loading = '//div[@values="elementVisible"]'

def createAudit(self, max_attempts=10):
    # max_attempts is an arbitrary retry limit; checks are 3 seconds apart
    for _ in range(max_attempts):
        if len(self.driver.find_elements(By.XPATH, loading)) != 0:
            print("Page loaded correctly")
            time.sleep(3)
            break
        time.sleep(3)

edited Dec 20, 2021 at 16:14

answered Nov 11, 2021 at 15:46

Obed Campos Hernández

