How to fetch a complete webpage (using JavaScript) in Python

Question

I'm trying to use urllib2 to fetch a webpage from a website. After I managed to log in and retrieve the page, I found that it contains some <script>...</script> blocks. How can I save the rendered output (the complete content of the webpage, not the script source)?

Are you saying you'd like to save the content of the page after any included JavaScript has been run?
– Matt Luongo, Commented Feb 4, 2012 at 17:42
Are you doing this for testing, screen-scraping for an application, or what? In general, with JavaScript it's the browser that creates the page content, so you need a real browser to duplicate that...
– Bill Gribble, Commented Feb 4, 2012 at 17:44
@MattLuongo Yes, I'm trying to pull some of my personal messages from a website which doesn't offer an API.
– Terry Shi, Commented Feb 4, 2012 at 17:47

shadyabhi · Accepted Answer · 2012-02-04 17:59:51Z

3

JavaScript can't easily be handled with urllib, because urllib only downloads the page source and never executes the scripts in it.
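To make this concrete, here is a minimal sketch using only the standard library. It counts the `<script>` elements in an HTML string. The sample markup and the `count_scripts` helper are made up for illustration. In Python 3, `urllib.request` takes the place of `urllib2`, and either one hands back the source as the server sent it, so scripts arrive as text and the content they would generate is missing.

```python
from html.parser import HTMLParser


class ScriptCounter(HTMLParser):
    """Counts <script> start tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.script_count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "script":
            self.script_count += 1


def count_scripts(html):
    """Return how many <script> elements appear in the given HTML string."""
    parser = ScriptCounter()
    parser.feed(html)
    parser.close()
    return parser.script_count


# Hypothetical page: the message list is filled in by JavaScript at load time,
# so the <div> is still empty in the raw source.
SAMPLE = """
<html><body>
<div id="messages"></div>
<script>document.getElementById("messages").textContent = "Hello";</script>
</body></html>
"""

if __name__ == "__main__":
    # With a live site the string would come from something like:
    #   from urllib.request import urlopen
    #   html = urlopen(url).read().decode("utf-8")
    print(count_scripts(SAMPLE))
```

A non-zero count on a page whose visible content is missing from the raw source is a good hint that the content is produced by scripts.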

What you need is a headless browser, for example one built on WebKit.

A simple example can be found here.

If you don't want to limit yourself to Python, try PhantomJS.
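As a minimal sketch of the headless-browser approach, the snippet below drives headless Chrome through Selenium rather than the WebKit bindings or PhantomJS named above. That substitution is an assumption, made because PhantomJS development has been suspended. It assumes the `selenium` package and a Chrome installation are available. `make_headless_chrome`, `fetch_rendered_html`, `main`, and the URL are illustrative names, not part of the original answer.

```python
import time


def make_headless_chrome():
    """Start Chrome without a visible window (assumes selenium and Chrome are installed)."""
    # Imported lazily so fetch_rendered_html below has no hard dependency on selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def fetch_rendered_html(driver, url, timeout=10.0, poll=0.2):
    """Load url, wait until the document reports it has loaded, return the page HTML."""
    driver.get(url)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if driver.execute_script("return document.readyState") == "complete":
            break
        time.sleep(poll)
    # page_source is the browser's view of the page, taken after scripts had a chance to run.
    return driver.page_source


def main():
    driver = make_headless_chrome()
    try:
        html = fetch_rendered_html(driver, "https://example.com/")  # placeholder URL
        print(len(html))
    finally:
        driver.quit()
```

Calling `main()` runs the sketch against the placeholder URL. Note that `document.readyState` being `"complete"` does not guarantee that content loaded later by AJAX requests has arrived; pages like that may need a wait on a specific element instead.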

answered Feb 4, 2012 at 17:59

shadyabhi


Matt Luongo · Answer · 2012-02-04 18:29:17Z

1

I'd also like to mention pywebkitgtk (which I've been using a lot lately as an embedded browser), and Selenium.

answered Feb 4, 2012 at 18:29

Matt Luongo


1 Comment

Terry Shi Over a year ago

Selenium with an actual browser driver is very useful; it can mimic most human interactions.
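A rough sketch of that kind of interaction follows, assuming a login form whose fields are named `username` and `password` and whose submit button matches `button[type='submit']`. All three selectors, the function names, and their parameters are hypothetical and must be adapted to the real site. The locator strings `"name"` and `"css selector"` are the values of Selenium's `By.NAME` and `By.CSS_SELECTOR` constants, written out so the functions can be exercised without importing Selenium.

```python
def log_in(driver, login_url, username, password):
    """Fill in and submit a login form the way a person would in a browser."""
    driver.get(login_url)
    # Hypothetical field names; inspect the real form to find the right ones.
    driver.find_element("name", "username").send_keys(username)
    driver.find_element("name", "password").send_keys(password)
    driver.find_element("css selector", "button[type='submit']").click()


def fetch_after_login(driver, login_url, page_url, username, password):
    """Log in, open the page of interest, and return its HTML as the browser sees it."""
    log_in(driver, login_url, username, password)
    driver.get(page_url)
    return driver.page_source
```

`driver` is expected to be a Selenium WebDriver instance such as `webdriver.Chrome()`. A real page may also need an explicit wait between submitting the form and loading the next page.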
