
I got a MemoryError when processing a 1.45 GB .xml file. I tried running the code on a smaller file and it works, so there shouldn't be any bugs in the code. The code opens an XML file, does some processing on its contents, and saves the result to a new txt file. I run Win7 x86, 2 GB RAM, Python 2.6.

Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    openfile('ukwiki-latest-pages-articles.xml')
  File "C:\Users\Vof Freeman\Desktop\Python\test.py", line 7, in openfile
    contents = F.read()
  File "C:\Python26\lib\codecs.py", line 666, in read
    return self.reader.read(size)
  File "C:\Python26\lib\codecs.py", line 466, in read
    newdata = self.stream.read()
MemoryError
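For context, the failing line `contents = F.read()` slurps the entire 1.45 GB file into memory at once, which is what triggers the MemoryError on a 2 GB machine. A minimal sketch of reading and writing in bounded-size chunks instead (the file names and the pass-through "transform" are placeholders, since the asker's actual processing code isn't shown):

```python
import io

# Create a small stand-in for the real 1.45 GB dump so the sketch runs.
with io.open("sample.xml", "w", encoding="utf-8") as f:
    f.write(u"<root>hello</root>")

with io.open("sample.xml", encoding="utf-8") as src, \
     io.open("output.txt", "w", encoding="utf-8") as dst:
    while True:
        chunk = src.read(64 * 1024)  # 64 KiB at a time, never the whole file
        if not chunk:
            break
        dst.write(chunk)  # real code would transform the chunk here
```

This keeps peak memory at roughly one chunk's worth regardless of the input file size, though chunked text reads only suit processing that doesn't need the whole XML tree at once.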

2 Answers


Since building an in-memory tree is not desirable (and in your case not practical either, given the amount of physical memory you have), there are two techniques you can use with lxml:

  • Supplying a target parser class
  • Using the iterparse method

Refer to the lxml parsing documentation to see how this can be done.
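To give a sense of the iterparse approach: a minimal sketch using the stdlib `xml.etree.ElementTree`, whose `iterparse` interface `lxml.etree` mirrors. The `<page>`/`<title>` element names here are hypothetical stand-ins for the dump's actual schema:

```python
import io
import xml.etree.ElementTree as ET

# A tiny in-memory document standing in for the huge file on disk.
xml_data = b"""<root>
  <page><title>First</title></page>
  <page><title>Second</title></page>
</root>"""

titles = []
for event, elem in ET.iterparse(io.BytesIO(xml_data), events=("end",)):
    if elem.tag == "page":
        titles.append(elem.findtext("title"))
        elem.clear()  # discard the processed subtree to keep memory flat

print(titles)
```

Because each `<page>` element is cleared as soon as it has been handled, memory use stays proportional to one element rather than to the whole document.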




Simply put, you don't have enough RAM to read this whole file into memory at once. You should split it up into smaller XML files and process them one at a time.

The fact that it worked on a smaller file tells me that there's nothing wrong with your code, it's just your hardware that can't handle it.

2 Comments

If I split it, how do I get a single txt file on the output then?
Keep the same file object open while you're reading each XML file, and continue writing to it throughout the program rather than closing the file and opening up a new one.
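A sketch of that pattern, with hypothetical chunk file names and a made-up `extract_titles` helper standing in for whatever processing the asker's code does:

```python
import xml.etree.ElementTree as ET

# Create two tiny stand-in chunk files so the sketch is runnable.
for name, title in [("chunk1.xml", "Alpha"), ("chunk2.xml", "Beta")]:
    with open(name, "w") as f:
        f.write("<root><page><title>%s</title></page></root>" % title)

def extract_titles(xml_path, out_file):
    # iterparse keeps memory use flat even for large inputs
    for event, elem in ET.iterparse(xml_path, events=("end",)):
        if elem.tag == "page":
            out_file.write((elem.findtext("title") or "") + "\n")
            elem.clear()

# Open the output once and keep writing to it across every chunk.
with open("combined.txt", "w") as out:
    for chunk in ["chunk1.xml", "chunk2.xml"]:
        extract_titles(chunk, out)
```

The single `with open(...)` around the loop is the key point: every chunk appends to the same open file object, so the output ends up in one txt file.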
