1

I am trying to get the href from a link (its text, 发行信息, means "issuance information"). Here is my code:

url ='http://money.finance.sina.com.cn/bond/notice/sz149412.html'
link = driver.find_element_by_xpath("//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']").get_attribute('href')
print(link)

Error:

 invalid selector: Unable to locate an element with the xpath expression 
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//div[@class='blk01'])//ul/li[3]//a[contains(text(),'发行信息']' is not a valid XPath expression.

It seems the XPath is not valid, but I cannot figure out what is wrong with it. Any help will be appreciated!

Thanks

2
  • Can you show us the error output? Commented Mar 29, 2021 at 9:47
  • Please see my updated question. Commented Mar 29, 2021 at 9:48

5 Answers

2
//a[contains(text(),'发行信息')]

Even this shorter XPath would work:

link = driver.find_element_by_xpath("//a[contains(text(),'发行信息')]")
print(link.get_attribute("href"))

1

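Your expression has a stray ) after the first predicate and is missing the ) that closes contains(...). Try this instead: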

link = driver.find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(), "发行信息")]')
print(link.get_attribute("href"))


2 Comments

Hi, thank you for your help, it worked! May I also ask: does get_attribute() not use the same syntax as .text()? Is that why I cannot use it?
Use element.text (a property), not text().
0
# Importing necessary modules
from seleniumwire import webdriver
from webdriver_manager.chrome import ChromeDriverManager
import time

# WebDriver Chrome
driver = webdriver.Chrome(ChromeDriverManager().install())

# Target URL
url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
driver.get(url)
time.sleep(5)
link = driver.find_element_by_xpath('//*[@class="blue" and contains(text(),"发行信息")]').get_attribute('href')
print(link)

3 Comments

Thank you so much, it worked! May I ask why my code does not work? ('//a[contains(text(),"发行信息"]')
You're welcome. Your expression is missing the ) that closes contains(...), which is why it was invalid in your case.
@Joyce Please consider accepting one of the solutions to close this question.
0
//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']

does not seem to be a stable XPath, and its parentheses are unbalanced: there is a stray ) after [@class='blk01'], and the ) that should close contains( is missing. This is the main problem; mixing ' and " is not what breaks it.

Try this first:

find_element_by_xpath('//div[@class="blk01"]//ul//li[3]//a[contains(text(),"发行信息")]')

If it works, try just:

find_element_by_xpath('//a[contains(text(),"发行信息")]')

The goal is to make the XPath as short as possible.
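
To see where an expression like the one in the question is malformed before handing it to Selenium, a rough delimiter check can help. This is only a sketch: first_delimiter_problem is a hypothetical helper (not part of Selenium), and it only checks that ( ) and [ ] pair up outside quoted strings. It is not an XPath parser.

```python
def first_delimiter_problem(xpath):
    """Return a description of the first unbalanced ( ) or [ ], or None if they pair up."""
    pairs = {")": "(", "]": "["}
    stack = []    # (opening character, index) for every delimiter still open
    quote = None  # quote character we are currently inside, if any
    for i, ch in enumerate(xpath):
        if quote:
            # Inside a string literal: ignore everything until the matching quote.
            if ch == quote:
                quote = None
            continue
        if ch in "'\"":
            quote = ch
        elif ch in "([":
            stack.append((ch, i))
        elif ch in pairs:
            if not stack:
                return f"unexpected {ch!r} at index {i}"
            opener, opened_at = stack.pop()
            if opener != pairs[ch]:
                return f"{ch!r} at index {i} closes {opener!r} opened at index {opened_at}"
    if quote:
        return f"unterminated {quote} quote"
    if stack:
        opener, opened_at = stack[-1]
        return f"{opener!r} opened at index {opened_at} is never closed"
    return None


broken = "//div[@class='blk01'])//ul//li[3]//a[contains(text(),'发行信息']"
fixed = "//div[@class='blk01']//ul//li[3]//a[contains(text(),'发行信息')]"

print(first_delimiter_problem(broken))  # unexpected ')' at index 21
print(first_delimiter_problem(fixed))   # None
```

The check stops at the first problem, so after removing the stray ) it would next report the ] that closes contains( too early.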


0

Any particular reason to use Selenium here? The link is present in the HTML source, so it would be more efficient to use requests and BeautifulSoup.

import requests
from bs4 import BeautifulSoup

url = 'http://money.finance.sina.com.cn/bond/notice/sz149412.html'
response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')


a_tag = soup.select_one('a:contains("发行信息")')
# a_tag = soup.select_one('a:-soup-contains("发行信息")')  # <- depending on your bs4 version, the line above may warn or fail because ':contains' is deprecated; use this form instead

link = a_tag['href']

Output:

print(link)
http://money.finance.sina.com.cn/bond/issue/sz149412.html
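
If installing bs4 is not an option, the same "find the <a> whose text contains a string" idea can be sketched with the standard library's html.parser. Everything here is illustrative: LinkByText is a hypothetical class, and sample_html is a made-up stand-in for the page, not its real markup.

```python
from html.parser import HTMLParser


class LinkByText(HTMLParser):
    """Collect the href of every <a> element whose text contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.matches = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.needle in "".join(self._text):
                self.matches.append(self._href)
            self._href = None


# Made-up markup for illustration only.
sample_html = """
<div class="blk01"><ul>
  <li><a class="blue" href="/other.html">other</a></li>
  <li><a class="blue" href="http://money.finance.sina.com.cn/bond/issue/sz149412.html">发行信息</a></li>
</ul></div>
"""

parser = LinkByText("发行信息")
parser.feed(sample_html)
print(parser.matches)
```

For a real page you would feed it response.text instead of sample_html.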
