I'm trying to scrape the Best Western Mornington Hotel website for the names of the hotel rooms and the price of each room. I'm using Selenium, but I keep getting no output, which I assume is because I'm using the wrong selectors/XPath. Is there a method for identifying the correct XPath, div class, or selector? I feel like I have selected the correct ones, but nothing is printed.

from re import sub
from decimal import Decimal
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time

seleniumurl = 'https://www.bestwestern.co.uk/hotels/best-western-mornington-hotel-london-hyde-park-83187/in-2021-06-03/out-2021-06-05/adults-1/children-0/rooms-1'



driver = webdriver.Chrome(executable_path='C:\\Users\\Conor\\Desktop\\diss\\chromedriver.exe')
driver.get(seleniumurl)
time.sleep(5)
working = driver.find_elements_by_class_name('room-type-block')

for work in working:
    name = work.find_elements_by_xpath('.//div/h4').string
    price = work.find_elements_by_xpath('.//div[2]/div[2]/div/div[1]/div/div[3]/div/div[1]/div/div[2]/div[1]/div[2]/div[1]/div[1]/span[2]').string
    print(name,price)

3 Answers

1

I only work with Selenium in Java, but from what I can see you're getting a collection of WebElements (find_elements_by_xpath returns a list) and then trying to read .string on that collection.

Shouldn't that be find_element_by_xpath, to get just one WebElement, followed by .text instead of .string?
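
To illustrate the point: find_elements returns a plain Python list (empty when nothing matches), so neither .string nor .text exists on it; the text lives on each individual WebElement. Here is a minimal sketch of a helper, first_text (a hypothetical name, not part of Selenium), that returns None when an XPath matches nothing, which also helps tell a wrong selector apart from an element that is simply empty. It passes the locator strategy as the string "xpath", which is the value of By.XPATH, so the snippet needs no import of its own.

```python
XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def first_text(parent, xpath):
    """Return the stripped text of the first match, or None if nothing matches.

    `parent` can be the driver or any WebElement; both expose find_elements().
    """
    matches = parent.find_elements(XPATH, xpath)  # always a list, possibly empty
    if not matches:
        return None  # the XPath matched nothing under `parent`
    return matches[0].text.strip()
```

Inside the question's loop it could be called as first_text(work, './/div/h4'); a None result would suggest the XPath itself needs revisiting.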

1

Marek is right: use .text instead of .string, or use .get_attribute("innerHTML"). I also think your XPath may be wrong, unless I'm looking at the wrong page. Here are some XPaths from the page you linked.

#This will get all the room type sections.
roomTypes = driver.find_elements_by_xpath("//div[contains(@class,'room-type-box__content')]")

#This will get the room type titles. Call it on the driver: find_elements_* returns a list, which has no find_elements_by_xpath method.
roomTitles = driver.find_elements_by_xpath("//div[contains(@class,'room-type-title')]/h3")

#Print out room type titles
for r in roomTitles:
    print(r.text)
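
On .text versus get_attribute: .text only returns text that is actually rendered, so it comes back empty for hidden or collapsed elements, whereas get_attribute("textContent") returns the raw text from the DOM. Whether that is what is happening on this page is an assumption I haven't verified, but a fallback along these lines may help. element_text is a hypothetical helper name.

```python
def element_text(element):
    """Prefer the rendered text; fall back to the DOM's textContent if it is empty."""
    text = element.text
    if text and text.strip():
        return text.strip()
    # Hidden elements yield "" from .text but still carry textContent in the DOM.
    raw = element.get_attribute("textContent")
    return raw.strip() if raw else ""
```

In the loop above, print(element_text(r)) would then replace print(r.text).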

0

Use the CSS selector div#rr_wrp div.room-type-block together with the visibility_of_all_elements_located expected condition to get the list of category divs.

Within each of those divs, you can find the title with the XPath .//h2[@class="room-type--title"], the sub-category with .//strong[@class="trimmedTitle rt-item--title"], and the price with .//div[@class="rt-rate-right--row group"]//span[@data-bind="text: priceText"].

Try the following code, which uses zip to loop over the two parallel lists (sub-categories and prices) together:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome(executable_path='C:\\Users\\Conor\\Desktop\\diss\\chromedriver.exe')

driver.get('https://www.bestwestern.co.uk/hotels/best-western-mornington-hotel-london-hyde-park-83187/in-2021-06-03/out-2021-06-05/adults-1/children-0/rooms-1')

# Wait up to 20 seconds for the room category blocks to become visible
wait = WebDriverWait(driver, 20)

elements = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div#rr_wrp div.room-type-block')))
for element in elements:
    for room_title in element.find_elements_by_xpath('.//h2[@class="room-type--title"]'):
        print("Main Title ==>> " + room_title.text)
        # Pair each sub-category title with its price; zip stops at the shorter list
        room_types = element.find_elements_by_xpath('.//strong[@class="trimmedTitle rt-item--title"]')
        room_prices = element.find_elements_by_xpath('.//div[@class="rt-rate-right--row group"]//span[@data-bind="text: priceText"]')
        for room_type, room_price in zip(room_types, room_prices):
            print(room_type.text + " " + room_price.text)

driver.quit()
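
The question imports re.sub and Decimal without using them, presumably to turn the price text into a number. Here is a minimal sketch of that step, assuming the price text looks something like '£1,234.50' (I haven't checked the exact format the site uses). parse_price is a hypothetical helper name.

```python
from decimal import Decimal, InvalidOperation
from re import sub


def parse_price(price_text):
    """Convert text such as '£1,234.50' to a Decimal; return None if no number is found."""
    # Drop everything except ASCII digits and the decimal point.
    cleaned = sub(r"[^0-9.]", "", price_text)
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        # Raised for an empty string or a malformed number such as '1.2.3'.
        return None
```

In the loop above it could be applied as parse_price(room_price.text).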
