I'm pretty new to programming generally, so I'm just trying to write a fun program to do some web scraping. My gf and I are playing Animal Crossing, and are trying to play the turnip game. There's a web page where people list the turnip prices on their islands. I'd like to write code that scrapes the page, identifies how many bells everyone is selling for, and then notifies me via text or email if anyone lists over 500 bells.

I'm stuck on step 1 here.

I'd like to scrape the HTML of the page and identify the bells using that. I initially tried BS4, but since the page is dynamic and builds its content with JavaScript, I had to use Selenium instead.

Here is the HTML I'm trying to identify:

<div data-v-dee358f6="" class="flex flex-row items-center justify-self-center">
    <img data-v-dee358f6="" src="/img/turnip.0cf2478d.png" class="w-6 object-scale-down">
    <p data-v-dee358f6="" class="ml-2">73 Bells</p>
</div>

I'd like to scrape anything of class ml-2 so I can pull the code that has the portion listing the bells. I've used the basic code as follows to try various methods to do this:

#Turnip notifier
#Reads the island page on the turnip exchange and sends a text message when an island goes above 500 bells

from selenium.webdriver import Firefox

webdriver = 'C:\\path'

driver = Firefox(webdriver)

#Open up turnip.exchange URL

url = "https://turnip.exchange/islands"

driver.get(url)

element = driver.find_element_by_class_name('ml-2')

HTML = element.get_attribute('outerHTML')

print(HTML)

This returns HTML, but for an element with a different class. I then tried a CSS selector, XPath, and so on, and each attempt reported that there was no such element.

I then tried to pull the HTML of the entire page, just to see what I'm working with, so my code now looks like this:

#Turnip notifier
#Reads the island page on the turnip exchange and sends a text message when an island goes above 500 bells

from selenium.webdriver import Firefox

webdriver = 'C:\\path'

driver = Firefox(webdriver)

#Open up turnip.exchange URL

url = "https://turnip.exchange/islands"

driver.get(url)

HTML = driver.execute_script("return document.documentElement.outerHTML;")

print(HTML)

This prints HTML, but not for the page as it looks live. It appears to be mostly formatting and things like that. So it seems I'm still not grabbing the live page as it appears in inspect element, even using Selenium to open the site.

Any ideas? Once I can pull the code which contains the number of bells, I'm pretty sure I have an idea of where to go from there in terms of creating a list/dictionary and storing the values, but I can't actually find the bells currently.

1 Answer

If you want to get a list of all the bells listings, you can get that from this:

bells_list = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".note p.ml-2")))

for bells in bells_list:
    print(bells.text)

Add this right after your driver.get(url) line. The explicit wait polls the page for up to 30 seconds until elements matching the selector are present, so the text is only read once the page's JavaScript has rendered the listings.
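To make the idea of an explicit wait less magical, here is a minimal pure-Python sketch of the polling pattern. It is not Selenium's actual implementation: wait_until and fake_find_listings are hypothetical names made up for illustration, and the fake function only stands in for a page whose listings appear after a few checks.

```python
import time


def wait_until(condition, timeout=30.0, poll=0.5):
    """Call condition() repeatedly until it returns something truthy.

    Raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Hypothetical stand-in for a page that only has listings on the third check.
attempts = {"count": 0}


def fake_find_listings():
    attempts["count"] += 1
    return ["73 Bells"] if attempts["count"] >= 3 else []


print(wait_until(fake_find_listings, timeout=5, poll=0.01))
```

WebDriverWait(driver, 30).until(...) follows the same general shape: it keeps calling the expected condition with the driver, returns the first truthy result (here, the list of elements), and raises a timeout exception if nothing shows up in time.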

You will need to add these imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
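For the next step mentioned in the question (storing the values and checking for anything over 500), a rough sketch might look like the following. It assumes every listing's text has the form "73 Bells" as in the HTML snippet above; parse_bells and prices_over are hypothetical helper names, not part of Selenium.

```python
import re


def parse_bells(text):
    """Return the integer price from text like '73 Bells', or None if absent."""
    match = re.search(r"(\d+)\s*Bells", text, re.IGNORECASE)
    return int(match.group(1)) if match else None


def prices_over(texts, threshold=500):
    """Keep only the prices strictly above the threshold."""
    prices = (parse_bells(t) for t in texts)
    return [p for p in prices if p is not None and p > threshold]


# Sample strings standing in for the .text of each scraped element.
sample = ["73 Bells", "512 Bells", "no price", "640 Bells"]
print(prices_over(sample))  # prints [512, 640]
```

With the scraping code above, the input would be something like prices_over([bells.text for bells in bells_list]), and a non-empty result is the cue to send the notification.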

2 Comments

Fantastic! This did the trick. Now I just have to look at the code and figure out how it works :)
Please accept my answer if all is good and please let me know if you have additional questions.
