0

I would like to retrieve the contents of a JavaScript file instead of executing it when I request it.

EDIT: I understand that Python is not executing the JavaScript code. The issue is that when I request this online JS script, it appears to get executed, and I'm unable to retrieve the contents of the script. Maybe what I want is to decode the script the way http://jsunpack.jeek.org/dec/go does.

This is the code I use to request the JS file:

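import urllib2

def request(self, uri):
    # data=None makes urllib2 issue a GET; self.header is the dict of request headers
    data = None
    req = urllib2.Request(uri, data, self.header)
    response = urllib2.urlopen(req)
    html_text = response.read()  # raw bytes of the response body
    return html_text.decode()    # decodes with the default (ASCII) codec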

I know approximately what the insides of the script look like but all I get after the request is issued is a 'loaded' message. My guess is that the JS code gets executed. Is there any way to just request the code?

5
  • I'm a little bit confused. How is JavaScript going to get executed from Python? Python does not know how to execute JS (JS and Python are two totally different languages). Commented Aug 4, 2011 at 18:47
  • Are you talking about retrieving JSON data? Commented Aug 4, 2011 at 18:48
  • The script is executed client side. I'm guessing Python is not executing it, but rather whatever Python uses as an HTML interpreter / browser. Commented Aug 4, 2011 at 18:49
  • No, Python is not executing the JavaScript. Commented Aug 4, 2011 at 18:51
  • I guess what I want is to decode the Javascript like so jsunpack.jeek.org/dec/go Commented Aug 4, 2011 at 18:53

2 Answers

2

There is no HTML or JavaScript interpreter in urllib2. This module does nothing but fetch the resource and return it to you raw; it certainly will not attempt to execute any JavaScript code it receives. If you are not receiving the response you expect, check the URL with a tool like wget or monitor the network connection with Wireshark or Fiddler to see what the server is actually returning.
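One way to convince yourself of this is to fetch a script from a server you control and compare the bytes. The sketch below is only an illustration, not the asker's setup: it uses Python 3's urllib.request (the successor to urllib2), and the local server, the JS_SOURCE body, and the fetch_raw and demo names are all made up for the example.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical script body served by the local test server.
JS_SOURCE = b"document.write('loaded');"


class ScriptHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same JavaScript bytes."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/javascript")
        self.send_header("Content-Length", str(len(JS_SOURCE)))
        self.end_headers()
        self.wfile.write(JS_SOURCE)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_raw(uri, headers=None):
    """Return (status, content_type, body_bytes) exactly as the server sent them."""
    req = urllib.request.Request(uri, headers=headers or {})
    # An empty ProxyHandler ignores proxy settings so the local server is reachable.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=10) as response:
        return response.status, response.headers.get("Content-Type"), response.read()


def demo():
    """Serve JS_SOURCE on a free local port, fetch it once, and return the result."""
    server = HTTPServer(("127.0.0.1", 0), ScriptHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_raw("http://127.0.0.1:%d/script.js" % server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

The body comes back byte for byte as the server wrote it. If a real URL returns something like a bare 'loaded' message, then that is what the server sent for that particular request (possibly depending on the headers supplied), not the result of anything being executed on the Python side.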

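(decode() in your code only converts the bytes of the HTTP response body to Unicode characters, using the default character encoding, which probably isn't a good idea: in Python 2 that default is ASCII, so any non-ASCII byte in the response raises UnicodeDecodeError.)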
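A more careful approach is to decode with the charset the server declares and fall back to a fixed encoding otherwise. This is a minimal sketch in Python 3; decode_body and its UTF-8 fallback are assumptions made for the example, not part of the original code.

```python
from email.message import Message


def decode_body(body, content_type, fallback="utf-8"):
    """Decode response bytes using the charset named in a Content-Type header value."""
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter, or None.
    charset = msg.get_content_charset() or fallback
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # The server named an encoding Python does not know; use the fallback.
        return body.decode(fallback, errors="replace")
```

With a live response object in Python 3, the header value would come from response.headers.get("Content-Type").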

ETA:

I guess what I want is to decode the Javascript like so jsunpack.jeek.org/dec/go

Ah, well that's a different game entirely. You can get the source for that here, though you'll also need to install SpiderMonkey, the JavaScript engine from Mozilla, to allow it to run the downloaded JavaScript.

There's no way to automatically ‘unpack’ obfuscated JavaScript without running it, since the packing code can do anything at all and JS is a Turing-complete language. All this tool does is run it with some wrapper code for functions like eval which packers/obfuscators typically use. Unfortunately, this sabotage is easily detectable, so if it's malware you're trying to unpack you'll find this fails as often as it succeeds.
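The wrapper idea can be shown with a toy analogue in Python rather than JavaScript. This is not how jsunpack is implemented; it is only a sketch of the same trick, with a hypothetical capture_eval_arguments helper and a made-up "packed" string. The unpacking stub really is executed, but the eval it calls is replaced by a function that records its argument instead of running it.

```python
def capture_eval_arguments(packed_source):
    """Run packed_source with eval replaced by a recorder; return what it was passed."""
    captured = []

    def recording_eval(code, *args, **kwargs):
        # Record the unpacked code instead of executing it.
        captured.append(code)

    # A global named eval shadows the builtin for code executed in this namespace.
    namespace = {"eval": recording_eval}
    exec(packed_source, namespace)
    return captured


# Toy "packer": the payload is hidden as hex and rebuilt just before eval sees it.
PAYLOAD = "print('loaded')"
PACKED = "eval(bytes.fromhex(%r).decode())" % PAYLOAD.encode().hex()
```

Calling capture_eval_arguments(PACKED) returns the hidden payload as a string. Note that exec still runs the stub itself, so this is in no sense a sandbox, and code that checks whether eval has been tampered with can behave differently, which is the detection problem described above.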

1

I'm not sure I understand. If I take a simplified version of your code and run it on a URI that's sure to have some JavaScript:

>>> import urllib2
>>> res = urllib2.urlopen("http://stackoverflow.com/questions/6946867/how-to-unpack-javascript-in-python")

and then print res.read() (or res.read().decode()), the JavaScript is intact.

Doing urlopen should retrieve whatever character stream the source provides. It's up to you to do something with it (render it as html, interpret it as javascript, etc).
