When I inspect a webpage, I can see both the HTML and the JavaScript. I've used Beautiful Soup to load and parse the HTML, but a large part of the page is generated by JavaScript, which pulls variables from a programmable logic controller (PLC). After loading and parsing the page with Beautiful Soup in Python, I can't find that data: only the static HTML is there.

The webpage reads the PLC directly and I can see the live values updating in the browser, but I can't import them into Python. The screenshot shows what the code looks like in the inspect window. Say I want to read the element with id="aout7" and attribute class="on": how can I do that?

[Screenshot: Inspect / View Source view of the webpage]

1 Answer

Webpages are best run in a browser, because only a browser engine executes the JavaScript that fills in the live values. There are APIs for remote-controlling a browser or browser engine. A popular one is Selenium, which has Python bindings: see https://pypi.org/project/selenium/. That page contains installation instructions:

pip install -U selenium

and some introductory examples, like this snippet issuing a Yahoo search:

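from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Start a Firefox session controlled by Selenium
browser = webdriver.Firefox()

browser.get('http://www.yahoo.com')
assert 'Yahoo' in browser.title

# find_element_by_name() was removed in Selenium 4.3; use find_element() with a By locator
elem = browser.find_element(By.NAME, 'p')  # Find the search box
elem.send_keys('seleniumhq' + Keys.RETURN)

browser.quit()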

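You will need something similar, but locating the element by its id. In current Selenium releases that is find_element(By.ID, 'aout7'), with By imported from selenium.webdriver.common.by; the older find_element_by_id helper has been removed (see https://selenium-python.readthedocs.io/locating-elements.html). Use the text attribute of the returned element to read its content, and get_attribute('class') to read its class.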
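Applied to the element from the question, this could look roughly like the sketch below. It assumes Selenium 4 with Firefox available. The helper names wait_for_element_state and read_live_value are made up for illustration. Because the JavaScript may need a moment to fetch the PLC values, the helper polls until the element shows up rather than reading it immediately.

```python
import time


def wait_for_element_state(driver, locator, timeout=10.0, poll=0.5):
    """Poll until an element matching `locator` exists, then return
    (visible text, list of CSS classes).

    `locator` is a (strategy, value) tuple such as (By.ID, "aout7").
    Hypothetical helper, not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        # find_elements returns an empty list instead of raising when nothing matches
        matches = driver.find_elements(*locator)
        if matches:
            element = matches[0]
            classes = (element.get_attribute("class") or "").split()
            return element.text, classes
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no element matching {locator!r} after {timeout} s")
        time.sleep(poll)


def read_live_value(url, element_id, timeout=10.0):
    """Open `url` in Firefox and read the element with the given id."""
    # Imported here so the polling helper above can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Firefox()
    try:
        browser.get(url)
        return wait_for_element_state(browser, (By.ID, element_id), timeout)
    finally:
        # Always close the browser, even if the element never appears
        browser.quit()
```

A call such as text, classes = read_live_value(url, "aout7"), with url set to the address of your PLC page, returns the displayed text and the class list, so "on" in classes tells you whether the element currently carries class="on". To follow a value over time, keep the browser open and call wait_for_element_state repeatedly instead of reopening the page.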
