Parse JavaScript-generated content using Java

Question

http://support.xbox.com/en-us/contact-us uses JavaScript to create some lists. I want to parse these lists for their text, so for that page I want to get back the following:

Billing and Subscriptions
Xbox 360
Xbox LIVE
Kinect
Apps
Games

I tried using JSoup for a while before noticing that the lists are generated by JavaScript. I have no idea how to go about parsing a page for its JavaScript-generated content.

Where do I begin?

tskuzzy · Accepted Answer · 2012-07-02 17:26:40Z

1

You'll want to use an HTML+JavaScript library like Cobra. It'll parse the DOM elements in the HTML as well as apply any DOM changes caused by JavaScript.

Matt Westlake · Accepted Answer · 2012-07-02 17:24:57Z

1

You could always download the whole page, split it into separate strings (on line breaks, etc.), and look for the string containing the information. Then take that string and pull the pieces you want out of it. That is the dirty way of doing it; I'm not sure whether there is a clean way.
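The split-and-search approach described above could be sketched roughly as follows. This is only an illustration: the class name `PageLineScanner`, its two helper methods, and the sample markup in `main` are hypothetical, and a real page would first have to be downloaded into a string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the "split the page and search it" idea.
final class PageLineScanner {

    // Splits the page source on line breaks and keeps the lines containing the marker.
    static List<String> linesContaining(String pageSource, String marker) {
        List<String> hits = new ArrayList<>();
        for (String line : pageSource.split("\\R")) {
            if (line.contains(marker)) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    // Returns the text between the first 'open' and the next 'close', or null if either is missing.
    static String textBetween(String line, String open, String close) {
        int start = line.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length();
        int end = line.indexOf(close, start);
        return end < 0 ? null : line.substring(start, end);
    }

    public static void main(String[] args) {
        // Made-up sample markup; the real page structure is an assumption.
        String page = "<ul>\n"
                + "  <li><a href=\"#\">Xbox 360</a></li>\n"
                + "  <li><a href=\"#\">Kinect</a></li>\n"
                + "</ul>";
        for (String line : linesContaining(page, "<a ")) {
            System.out.println(textBetween(line, "\">", "</a>"));
        }
    }
}
```

As the answer says, this is fragile: it breaks as soon as the markup puts two links on one line or changes its attribute order.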

Bob Davies · Accepted Answer · 2012-07-02 17:32:56Z

0

I don't think that text is generated by JavaScript. If I disable JavaScript, those options can still be found inside the HTML at this location (a jQuery selector, just because it was easier to hand-write than figuring out the XPath without JavaScript enabled :)):

'div#ShellNavigationBar ul.NavigationElements li ul li a'

Regardless, in direct answer to your query: you'd have to evaluate the JavaScript within the scope of the document, which I expect would be rather complex in Java. You'd have more luck identifying the JavaScript file that generates the relevant content and parsing that directly.
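If the relevant script turns out to hold the list text as plain string literals, parsing it directly might look something like the sketch below. The format of the script is an assumption (nothing in this thread shows it), and the class name `ScriptStringExtractor` and the sample script are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects double-quoted string literals from JavaScript source text.
final class ScriptStringExtractor {

    // A double-quoted literal: any run of non-quote, non-backslash characters or backslash escapes.
    private static final Pattern DOUBLE_QUOTED =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    static List<String> stringLiterals(String scriptSource) {
        List<String> literals = new ArrayList<>();
        Matcher matcher = DOUBLE_QUOTED.matcher(scriptSource);
        while (matcher.find()) {
            // Raw contents of the literal; escape sequences are not decoded.
            literals.add(matcher.group(1));
        }
        return literals;
    }

    public static void main(String[] args) {
        // Made-up script line; the real file's structure is unknown.
        String script = "var topics = [\"Xbox 360\", \"Kinect\"];";
        System.out.println(stringLiterals(script));
    }
}
```

This does not execute any JavaScript, so it only helps when the text appears literally in the script rather than being computed at run time.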
