
Say I run a Python (2.7, though I'm not sure that makes a difference here) script. Instead of terminating the script, I tab out, or somehow switch back to my editing environment. I can then modify the script and save it, but this changes nothing in the still-running script.

Does Python load all source files into memory completely at launch? I am under the impression that this is how the Python interpreter works, but it contradicts my other views of the interpreter: I have heard that .pyc files serve as byte-code for Python's virtual machine, like .class files in Java. At the same time, however, some (very few, in my understanding) implementations of Python also use just-in-time compilation techniques.

So am I correct in thinking that if I make a change to a .py file while my script is running, I don't see that change until I re-run the script, because at launch all necessary .py files are compiled into .pyc files, and simply modifying the .py files does not remake the .pyc files?

If that is correct, then why don't huge programs, like the one I'm working on with ~6,550 kilobytes of source code distributed over 20+ .py files, take forever to compile at startup? How is the program itself so fast?


Additional Info:

  • I am not using third-party modules. All of the files have been written locally. The main source file is relatively small (10 kB), but the source file I primarily work on is 65 kB. It was also written locally and changes every time before launch.
    Wait, 6MB of source code in 20 files? Commented Aug 7, 2014 at 18:46
  • Are you using an IDE? Commented Aug 7, 2014 at 18:49

2 Answers


Python loads the main script into memory, compiles it into bytecode and runs that. If you modify the source file in the meantime, you're not affecting the bytecode.

If you're running the script as the main script (i.e. by calling it like python myfile.py), then the bytecode will be discarded when the script exits.

If you're importing the script, however, then the bytecode will be written to disk as a .pyc file which won't be recompiled when imported again, unless you modify the corresponding .py file.
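This caching behavior can be seen directly: within one process, a repeated import is answered from sys.modules without touching the disk, and only an explicit reload recompiles the changed source. A minimal sketch in Python 3 syntax (the module name mymod is made up for illustration; sys.dont_write_bytecode is set so the demo leaves no .pyc files behind):

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # keep the demo self-contained: no .pyc files written

# Create a throwaway module on disk ("mymod" is a made-up name).
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "mymod.py"), "w") as f:
    f.write("VALUE = 1\n")
sys.path.insert(0, tmpdir)

import mymod
v_first = mymod.VALUE            # 1

# Edit the source on disk while the process is running.
with open(os.path.join(tmpdir, "mymod.py"), "w") as f:
    f.write("VALUE = 2\n")

import mymod                     # answered from sys.modules: nothing is recompiled
v_reimport = mymod.VALUE         # still 1

importlib.reload(mymod)          # only an explicit reload recompiles the source
v_reload = mymod.VALUE           # 2

print(v_first, v_reimport, v_reload)
```

In Python 2.7 the same experiment works with the builtin reload() in place of importlib.reload().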

Your big 6.5 MB program consists of many modules which are imported by the (probably small) main script, so only that will have to be compiled at each run. All the other files will have their .pyc file ready to run.


6 Comments

I am not using third-party modules. All of the files have been written locally. The main source file is relatively small (10 kB), but the source file I primarily work on is 65 kB. It was also written locally and changes every time before launch. Basically, what I'm trying to get at is my relatively small main file is not compiled at each run, while one of my relatively large source files is compiled at each run.
I didn't say anything about third-party modules. Any Python script which is imported is a module. Since you said that your program consists of 20+ files, you probably have one main script and about 20 modules that may be imported by it. (And 65 KB is nothing, Python will compile that in an instant.)
Ah, thank you. So why does Python compile so much faster than C? Is it because C is compiled into native code while Python is compiled into byte-code to be read by its virtual machine?
Presumably the relatively large file is recompiled every time because you change it, thereby requiring the recreation of the .pyc file. If you are worried about execution time (and it really doesn't seem you should be) then isolate the bits you change in a further importable module.
@user3745189: That's hard to compare. Python programs are much higher-level than C programs, so something you'd write as a Python one-liner might take hundreds of lines of source code in C, and yes, compilation to bytecode is something else entirely than compiling to machine language (and optimizing and linking it). After all, the reason computation-intensive programs run so much faster in C is that CPython has to interpret the bytecode for those instructions every time they are executed (unless the work is already done inside a C module).

First of all, you are indeed correct in your understanding: changes to a Python source file aren't seen by the interpreter until the next run. There are some debugging systems, usually built for proprietary purposes, that allow you to reload modules in a running program, but this brings attendant complexities, such as existing objects retaining references to code from the old module. It can get really ugly.

The reason huge programs start up so quickly is that the interpreter creates a .pyc file for every .py file it imports whenever no corresponding .pyc file exists, or when the .py file is newer than the existing .pyc. The .pyc is indeed the program compiled into byte code, so it's relatively quick to load.
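That write-once bytecode cache can be poked at from Python itself: py_compile writes the cached bytecode to the standard __pycache__ location, which is the same path that importlib.util.cache_from_source computes. A small sketch in Python 3 syntax (the module name big_module is made up; Python 2.7 places the .pyc next to the .py instead of in __pycache__):

```python
import importlib.util
import os
import py_compile
import tempfile

# A stand-in for one of the program's modules (the name is made up).
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "big_module.py")
with open(src, "w") as f:
    f.write("def answer():\n    return 42\n")

# Compile to bytecode, as the interpreter would on first import.
pyc = py_compile.compile(src)

print(os.path.exists(pyc))                           # the cached bytecode exists
print(pyc == importlib.util.cache_from_source(src))  # at the standard cache path
```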

As far as JIT compilation goes, you may be thinking of the PyPy implementation, which is written in a restricted subset of Python (RPython) and has backends in several different languages. It's increasingly being used in Python 2 shops where execution speed is important, but it's a long way from the CPython that we all know and love.

2 Comments

No need for proprietary extensions: reload is a builtin.
reload() may be a built-in (in Python 3 you use importlib.reload() instead), but when you are debugging a real-time shoot-em-up space game client with C++ graphical extensions and you need to stop execution, edit a module and reload it, that's far from the end of your potential problems. I would suggest there are relatively few use cases for reload() in production code. The extensions are usually proprietary because they are so specific to a single code base that it would be hard to generalize them.
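The stale-reference problem both comments allude to is easy to reproduce: objects created before the reload keep pointing at the old class object. A minimal sketch in Python 3 syntax (the module name game is made up; in 2.7 the builtin reload() behaves the same way):

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # keep the demo free of .pyc side effects

# A throwaway module ("game" is a made-up name for illustration).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "game.py")
with open(path, "w") as f:
    f.write("class Ship:\n    speed = 1\n")
sys.path.insert(0, tmpdir)

import game
old_ship = game.Ship()           # instance created before the reload

with open(path, "w") as f:       # edit the module on disk
    f.write("class Ship:\n    speed = 2\n")
importlib.reload(game)

new_speed = game.Ship().speed    # new code is live for new instances
old_speed = old_ship.speed       # old instance still bound to the old class
same_class = isinstance(old_ship, game.Ship)   # the two classes are distinct
print(new_speed, old_speed, same_class)
```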
