I'm encountering a peculiar issue when using Selenium in a Python script to interact with a webpage. Specifically, I'm trying to scrape data from this webpage.

Python code:

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

link = "https://www.homedepot.com/p/Ejoy-20-in-H-x-20-in-W-GorgeousHome-Artificial-Boxwood-Hedge-Greenery-Panels-Milan-12-pc-Milan-12pc/314160722"

# Launch a Chrome browser
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=options)
driver.get(link)

# Wait for the page to load
time.sleep(3)

# Find the drop-down element
dropdown = driver.find_element(By.XPATH, "//*[@id='product-section-specifications']")
dropdown.click()

# Attempt to find elements within the drop-down
headers = dropdown.find_elements(By.CSS_SELECTOR, "div[class='kpf__name']")
print(len(headers))
print(headers[0].text)

The issue I'm encountering is that while the page appears to load properly, certain elements, specifically those within a drop-down, disappear before I can interact with them. Strangely, when I manually open the same webpage in my personal Chrome browser, everything loads and remains visible as expected.

I initially suspected that a script might be removing the content, but upon comparing the console output between Selenium and my personal browser, they show the same "error" output, leading me to believe that this might not be the cause of the issue.

1 Answer


The site appears to have some sort of anti-bot protection enabled. Many sites do this now to deter scraping.


Having said that, I rewrote your script to be cleaner and more efficient in case you wanted some suggestions for improvement...

  1. You don't need to define ChromeOptions() if you aren't going to use them. You can simply use

    driver = webdriver.Chrome()
    
  2. Don't use time.sleep(). It's not a good practice and is considered a "dumb" wait. It always waits the specified time even if the element is available earlier. Use WebDriverWait and wait for specific conditions instead, e.g. wait for clickable if you are going to click something and wait for visible if you are going to interact with an element in some other way.
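
To illustrate the difference between a fixed sleep and a condition-based wait, here is a minimal pure-Python sketch of the polling idea. It is not Selenium's implementation; `wait_until`, `ready_after`, and the timing values are hypothetical names and numbers picked for this example. The point is only that the wait returns as soon as the condition holds instead of always using up the full timeout.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.05):
    """Call condition() repeatedly until it returns a truthy value.

    Illustrative only: mimics the idea of an explicit wait, not Selenium's code.
    Raises TimeoutError if the condition is still falsy once timeout has elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


def ready_after(delay):
    """Return a condition that becomes truthy once `delay` seconds have passed."""
    start = time.monotonic()
    return lambda: time.monotonic() - start >= delay


# The condition becomes true after about 0.2 s, so the wait returns long before
# the 5 s timeout, whereas time.sleep(5) would always block for the full 5 s.
started = time.monotonic()
wait_until(ready_after(0.2), timeout=5.0)
elapsed = time.monotonic() - started
print(f"explicit wait returned after {elapsed:.2f} s")
```

In Selenium, `WebDriverWait(driver, 10).until(...)` plays a similar role to `wait_until` here, with the expected condition standing in for `condition`.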

With all this in mind, here's the rewritten script:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://www.homedepot.com/p/Ejoy-20-in-H-x-20-in-W-GorgeousHome-Artificial-Boxwood-Hedge-Greenery-Panels-Milan-12-pc-Milan-12pc/314160722"

# Launch a Chrome browser
driver = webdriver.Chrome()
driver.get(url)

wait = WebDriverWait(driver, 10)

# Open the Specification accordion element
wait.until(EC.element_to_be_clickable((By.ID, "product-section-key-feat"))).click()

# Find elements in the Specification section
headers = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.kpf__name")))
print(len(headers))
for header in headers:
    print(header.text)