Cannot get HTML to match inspect page even using selenium - Python 3

Question

I'm pretty new to programming generally, so I'm just trying to write a fun program to do some webscraping. My gf and I are playing Animal Crossing, and are trying to play the turnip game. There's a web-page where people list the turnip prices on their islands. I'd like to write code that scrapes the page, identifies how many bells everyone is selling for, and then notifies me via text or email if anyone lists over 500 bells.

I'm stuck on step 1 here.

I'd like to scrape the HTML of the page, and identify the bells using that. I initially tried with BeautifulSoup (BS4), but found that since the page is dynamic and builds its content with JavaScript, I had to use Selenium instead.

Here is the HTML I'm trying to identify:

<div data-v-dee358f6="" class="flex flex-row items-center justify-self-center">
    <img data-v-dee358f6="" src="/img/turnip.0cf2478d.png" class="w-6 object-scale-down">
    <p data-v-dee358f6="" class="ml-2">73 Bells</p>
</div>

I'd like to scrape anything of class ml-2 so I can pull the code that has the portion listing the bells. I've used the basic code as follows to try various methods to do this:

#Turnip notifier
#Reads the island page on the turnip exchange and sends a text message when an island goes above 500 bells

from selenium.webdriver import Firefox

webdriver = 'C:\\path'

driver = Firefox(webdriver)

#Open up turnip.exchange URL

url = "https://turnip.exchange/islands"

driver.get(url)

element = driver.find_element_by_class_name('ml-2')

HTML = element.get_attribute('outerHTML')

print(HTML)

This returns HTML, but for an element of a different class. I then tried a CSS selector, XPath, and so on, each of which reported that there was no such element.

I then tried to pull the HTML of the entire page, just to see what I'm working with, so my code now looks like this:

#Turnip notifier
#Reads the island page on the turnip exchange and sends a text message when an island goes above 500 bells

from selenium.webdriver import Firefox

webdriver = 'C:\\path'

driver = Firefox(webdriver)

#Open up turnip.exchange URL

url = "https://turnip.exchange/islands"

driver.get(url)

HTML = driver.execute_script("return document.documentElement.outerHTML;")

print(HTML)

This prints HTML, but not for the page as it looks live. It appears to be mostly formatting and things like that. So it seems I'm still not grabbing the live page as it appears in inspect element, even using Selenium to open the site.

Any ideas? Once I can pull the code which contains the number of bells, I'm pretty sure I have an idea of where to go from there in terms of creating a list/dictionary and storing the values, but I can't actually find the bells currently.

RKelley · Accepted Answer · 2020-04-21 02:31:52Z


If you want a list of all the bells listings, you can get it with this:

bells_list = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".note p.ml-2")))

for bells in bells_list:
    print(bells.text)

Add this right after your driver.get(url) line. It uses an explicit wait: Selenium keeps checking the page for up to 30 seconds until elements matching the selector are present, then returns them so their text can be read.
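To show why the wait matters, here is a rough sketch of the idea behind an explicit wait: call a check repeatedly until it returns something truthy or a timeout passes. This is a simplified illustration, not Selenium's actual implementation; `wait_until` and `FakePage` are hypothetical names made up for the example, and `FakePage` only imitates a page whose listings appear after a few checks.

```python
import time


def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page that fills in its listings late."""

    def __init__(self, ready_after_calls):
        self.calls = 0
        self.ready_after_calls = ready_after_calls

    def find_bells(self):
        # Return an empty list until the "scripts" have finished running.
        self.calls += 1
        if self.calls < self.ready_after_calls:
            return []
        return ["73 Bells", "512 Bells"]


page = FakePage(ready_after_calls=3)
found = wait_until(page.find_bells, timeout=5.0, poll_interval=0.01)
print(found)
```

A lookup made immediately after driver.get(url) is like calling `find_bells()` only once: it can run before the listings exist, which would explain the "no element" results in the question.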

The WebDriverWait call needs these imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
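Once the loop prints strings such as "73 Bells", the next step mentioned in the question (finding listings over 500 bells) could look something like the sketch below. It assumes every listing's text contains a number followed by the word "Bells", as in the HTML sample from the question; `parse_bells` and `prices_above` are hypothetical helper names, and the sample strings stand in for the `bells.text` values.

```python
import re

# Assumed text format, based on the "73 Bells" sample in the question.
BELLS_PATTERN = re.compile(r"(\d+)\s*Bells", re.IGNORECASE)


def parse_bells(text):
    """Return the price in text such as '73 Bells' as an int, or None if absent."""
    match = BELLS_PATTERN.search(text)
    return int(match.group(1)) if match else None


def prices_above(texts, threshold=500):
    """Return the parsed prices that are strictly greater than threshold."""
    prices = (parse_bells(text) for text in texts)
    return [price for price in prices if price is not None and price > threshold]


# Sample strings standing in for [bells.text for bells in bells_list].
sample_texts = ["73 Bells", "512 Bells", "", "601 Bells"]
high_prices = prices_above(sample_texts)
print(high_prices)
```

With the real page you would pass in `[bells.text for bells in bells_list]` instead of `sample_texts`, and send the notification whenever the returned list is not empty.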


2 Comments

JetJaguar124 Over a year ago

Fantastic! This did the trick. Now I just have to look at the code and figure out how it works :)

RKelley Over a year ago

Please accept my answer if all is good and please let me know if you have additional questions.
