I'm new to JavaScript and am trying to parse it with Python. I've been using BeautifulSoup along with Requests to extract the 'file' line from the 'RT.currentVideo' section of this script, but I can't get it to work. I'm completely lost as to how I'd even store this section of the webpage, since it doesn't have an identifier like the ones in most related questions I've found online.

Any help would really be appreciated, thanks for taking the time to check in!

This is what I've been using to read the page:

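# Imports assumed from the calls below (Python 3 urllib and bs4)
from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

url = "http://roosterteeth.com/episode/rt-docs-connected-connected-official-trailer"
req = Request(url, headers={'User-Agent': 'Mozilla/5.0', 'Accept-Encoding': 'utf-8'})
response = urlopen(req)
webpage = BeautifulSoup(response.read().decode('utf-8', 'ignore'), "html.parser")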

And this is the JavaScript block on the page I want to extract info from. Again, what I'm looking to get is the string in the 'file' property.

<script>
    RT.currentVideo = {
      authUser: 0,
      autoPlay: 1,
      csrfToken: 'H240Yw8x9oYasUw2Tzt3qpwzA14Z1ajRjuXo6RV1',
      endPoint: 89,
      desktopAgent: 1,
      file: 'https://rtv2-video.roosterteeth.com/uploads/videos/0e840b4f-a188-440d-adc0-b78093c1009f/index.m3u8',

That's not JSON in that tag, it is Javascript. Commented Jan 16, 2018 at 6:52

2 Answers

You can use a regular expression to extract that value from the page HTML.

import re
# [^']+ stops at the first closing quote, so the match cannot run past the value
regex = r"file:\s*'([^']+)'"

matches = re.findall(regex, webpageHtmlString)

print(matches[0])

webpageHtmlString should be the HTML of the page as a string (for example, str(webpage) for the BeautifulSoup object in the question).
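If other scripts on the page could also contain a 'file' key, one option is to narrow the search to the RT.currentVideo block first. The following is only a sketch under a few assumptions: sample_html and extract_file_url are hypothetical names, the sample URL is a placeholder, and the object literal is assumed to contain no nested braces.

```python
import re

# Hypothetical sample standing in for the downloaded page HTML.
sample_html = """
<script>
    RT.currentVideo = {
      authUser: 0,
      file: 'https://example.com/videos/index.m3u8',
      endPoint: 89,
    };
</script>
"""

def extract_file_url(html):
    """Return the 'file' value from the RT.currentVideo block, or None."""
    # Step 1: isolate the object literal assigned to RT.currentVideo.
    # Assumes no nested braces, since the match ends at the first '}'.
    block = re.search(r"RT\.currentVideo\s*=\s*\{(.*?)\}", html, re.DOTALL)
    if block is None:
        return None
    # Step 2: find the single-quoted value of the file key inside that block.
    match = re.search(r"\bfile:\s*'([^']*)'", block.group(1))
    return match.group(1) if match else None

print(extract_file_url(sample_html))
```

With the code from the question, you would pass str(webpage) instead of sample_html.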

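Use PyQuery to get jQuery-like querying on HTML content from Python.

from pyquery import PyQuery as pq

# webpageHtmlString is assumed to hold the page HTML as a string
doc = pq(webpageHtmlString)
script_tags = doc('script')  # selects all <script> elements on the page

print(script_tags[0].text)  # inline JavaScript source of the first <script> tag

Depending on the page content, you can then use jQuery-like selectors to narrow down to the script you need.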
