I want to extract some data from a JavaScript-rendered page using Selenium WebDriver in Python 3. I have tried several drivers, including Firefox, ChromeDriver, and PhantomJS, but always get the same result: instead of the rendered DOM elements, I only get the scripts.

Here is a snippet of my code:

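from selenium import webdriver

url = 'https://www.google.com/flights/explore/#explore;f=BDO;t=r-Asia-0x88d9b427c383bc81%253A0xb947211a2643e5ac;li=0;lx=2;d=2018-01-09'
# Path to the local chromedriver binary
driver = webdriver.Chrome("/var/chromedriver/chromedriver")
# The implicit wait only applies to element lookups, not to page_source
driver.implicitly_wait(20)
driver.get(url)

print(driver.page_source)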

Am I missing something here?

  • Do you get an error message? If so, please add the traceback to your post. Commented Jan 2, 2018 at 11:33
  • There is no error message when I execute this code. It just gives me an unexpected result. Commented Jan 3, 2018 at 10:40

2 Answers

I don't see any issue in your code block. I ran your script as follows:

from selenium import webdriver

url = 'https://www.google.com/flights/explore/#explore;f=BDO;t=r-Asia-0x88d9b427c383bc81%253A0xb947211a2643e5ac;li=0;lx=2;d=2018-01-09'
driver = webdriver.Chrome()
driver.get(url)
print(driver.page_source)

I get the following console output:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">

<head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
  <meta name="deals::gwt:property" content="baseUrl=/flights/explore//static/" />
  <title>Explore flights</title>
  <meta name="description" content="Explore flights" />
  <script src="https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.yoTdpQipo6s.O/m=gapi_iframes,googleapis_client,plusone/rt=j/sv=1/d=1/ed=1/am=AAE/rs=AHpOoo9_VhuRoUovwpPPf5LqLZd-dmCnxw/cb=gapi.loaded_0" async=""></script>
  <script language="javascript" type="text/javascript">
    var __JS_ILT__ = new Date();
    .
    .
    .
</div></div><div aria-hidden="true" style="display: none;"><div class="CTPFVNB-l-j CTPFVNB-l-h">Displayed currencies may differ from the currencies used to purchase flights.– <a href="https://www.google.com/intl/en/googlefinance/disclaimer/" class="CTPFVNB-l-k">Disclaimer</a></div></div>
<div aria-hidden="true" style="display: none;"><div class="CTPFVNB-l-j CTPFVNB-l-h">Showing licensed rail data. – <a href="https://www.google.com/intl/en/help/legalnotices_maps.html" class="CTPFVNB-l-k">Legal Notice</a></div></div>
<div class="CTPFVNB-l-i"><a class="CTPFVNB-l-k CTPFVNB-l-j" href="https://www.google.com/intl/en/policies/">Privacy &amp; Terms</a><a class="CTPFVNB-l-k CTPFVNB-l-j" href="https://support.google.com/flights/?hl=en">Help Center</a></div></div></div>
<iframe id="deals" tabindex="-1" style="position: absolute; width: 0px; height: 0px; border: none; left: -1000px; top: -1000px;">
</iframe><input type="text" id="_bgInput" style="display:none;" /></body></html>

Now, as you can see at the very end of the page_source, there is an iframe. Until you switch to that iframe, you won't be able to find the DOM elements you are looking for.
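
A minimal sketch of such a switch, assuming the frame id `deals` from the output above. The helper name `page_source_inside_frame` is hypothetical, and whether this particular frame actually holds the data you want is not confirmed here. `driver.switch_to.frame(...)` and `driver.switch_to.default_content()` are the standard Selenium WebDriver calls for entering and leaving a frame.

```python
def page_source_inside_frame(driver, frame_ref):
    """Return the HTML of one frame, then restore the top-level context.

    `driver` is any Selenium WebDriver. `frame_ref` may be a frame id or
    name, an index, or a WebElement, as accepted by switch_to.frame().
    """
    driver.switch_to.frame(frame_ref)
    try:
        return driver.page_source
    finally:
        # Always switch back so later lookups target the main document.
        driver.switch_to.default_content()


# Usage (assumes selenium is installed and chromedriver is on PATH):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get(url)  # `url` as defined in the script above
#   print(page_source_inside_frame(driver, "deals"))
#   driver.quit()
```

Switching back in the `finally` clause keeps the driver pointed at the main document even if reading the frame fails.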

2 Comments

Thanks for your explanation. But the problem is that the output from page_source is different from what I see when I inspect the page. For example, I want to collect all the available prices. When I try to parse them from page_source, I get nothing because the prices are not included there. In the inspector, the prices exist outside the iframe tag.
Yes, you are right. Switch to the iframe and take page_source, and you will find it all. I didn't see any mention of prices in the question. Feel free to raise a new question for your new requirement. If my answer has addressed your question, please accept it.

Use helium, a Selenium wrapper:

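# pip install helium
import time

import helium

url_one = "https://www.vbiz.in/nseoptionchain.html"
# start_chrome returns the underlying Selenium WebDriver
browser_one = helium.start_chrome(url_one, headless=True)
# Give the page's JavaScript a few seconds to render
seconds = 5
time.sleep(seconds)
html = browser_one.page_source
# quit() closes the browser and also stops the driver process
browser_one.quit()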
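
The fixed sleep above is a guess at how long rendering takes. As a rough alternative, the sketch below polls `page_source` until a caller-supplied check passes. The helper name `wait_for_page_source` and the `"<table"` marker in the usage comment are hypothetical; pick a marker that only appears once the content you need has rendered.

```python
import time


def wait_for_page_source(driver, is_ready, timeout=20.0, poll=0.5):
    """Poll driver.page_source until is_ready(html) is true or time runs out.

    Returns the first HTML snapshot accepted by is_ready and raises
    TimeoutError if none is accepted within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if is_ready(html):
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError("page did not reach the expected state")
        time.sleep(poll)


# Usage with the browser started above (the marker is an assumption):
#   html = wait_for_page_source(browser_one, lambda h: "<table" in h)
```

Selenium ships its own polling helper for element-level conditions: `WebDriverWait(driver, 20).until(...)` from `selenium.webdriver.support.ui` accepts a callable that receives the driver and returns a truthy value once the condition holds.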
