
I am new to Java. I have a 2 GB XML file which I need to parse and store its data in a database.

Someone on Stack Overflow recommended Dom4j for large XML files. Parsing works fine, but the Document returned by Dom4j is very large, and iterating over it loads all of the DOM objects into memory (onto the heap).

This results in out-of-memory errors. Can somebody please explain how to avoid them? Is there some mechanism in Java for allocating and releasing heap memory on demand?

  • Is SAX or StAX an option for this? Do you need all the data in memory? Commented Jun 10, 2013 at 9:55
  • Use a StAX parser, or increase the heap size. Commented Jun 10, 2013 at 9:55
  • Quickest solution: run your Java app with more memory (try using 4 GB). More detailed solution: do not keep the whole XML in memory (since it won't fit); instead, process it in chunks. Commented Jun 10, 2013 at 9:55

2 Answers


You have two choices:

  1. reconfigure your JVM to allow a larger maximum heap (via -Xmx2g or similar). This option is obviously limited by your OS and the amount of free memory on your system.
  2. use a streaming API (such as SAX) that doesn't load the whole XML document into memory at once, but rather streams it through your process, allowing you to analyse it without holding the entire document in memory (see the sketch below)

The first option may help you immediately, and isn't specific to this question. The second option is the more scalable solution since it'll allow you to analyse documents of any size. Of course you need to worry about the memory consumption of the results of your analysis, but that's another matter entirely.
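To make the streaming approach concrete, here is a minimal sketch using StAX (javax.xml.stream), which ships with the JDK. The element name record and the handleRecord method are hypothetical placeholders for whatever your 2 GB file and persistence code actually contain, so treat this as an illustration of the pull-parsing pattern rather than a drop-in solution:

```java
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StreamingImport {

    public static void main(String[] args) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try (InputStream in = new FileInputStream("huge.xml")) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            while (reader.hasNext()) {
                // Pull one event at a time; only the current element is in memory
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "record".equals(reader.getLocalName())) {
                    handleRecord(reader);
                }
            }
            reader.close();
        }
    }

    // Hypothetical handler: read the fields of one <record>, insert a row
    // (e.g. via JDBC), and let the data become garbage-collectable before
    // the next record is parsed.
    private static void handleRecord(XMLStreamReader reader) {
        // e.g. reader.getAttributeValue(null, "id") and a database insert here
    }
}
```

Because nothing outside the current element is retained, the memory footprint stays roughly constant regardless of the file size.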


1 Comment

Thanks Brian, increasing the heap size is of course known to me, and processing the XML in chunks is a good suggestion. But I need some generic solution for avoiding too much data being loaded onto the heap. A related problem came up with a large table too, with around 15000 records; there, too, some suggested using cursors. But these solutions seem contextual. Is there any generic solution or set of guidelines for avoiding out-of-memory errors? Also, Dom4j has a SAX parser.

If you need to parse big XML files (and enlarging the Java heap does not always work), you need a SAX parser, which lets you parse the XML as a stream instead of loading the whole DOM tree into memory.

You may also check SAXDOMIX

SAXDOMIX contains classes that can forward SAX events or DOM sub-trees to your application during the parsing of an XML document. The framework defines simple interfaces that allow the application to get DOM sub-trees in the middle of a SAX parsing. After handling, all DOM sub-trees become eligible for garbage collection. This solves the DOM scalability problem.
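Since the question uses Dom4j, it is worth noting that Dom4j offers a similar mixed SAX/DOM pattern through SAXReader.addHandler and ElementHandler: each matching sub-tree is handed to you as a small DOM, and detaching it afterwards lets it be garbage-collected. A minimal sketch, assuming the file contains repeated <record> elements under a <records> root (both names are placeholders) and a hypothetical storeInDatabase method:

```java
import java.io.File;
import org.dom4j.Element;
import org.dom4j.ElementHandler;
import org.dom4j.ElementPath;
import org.dom4j.io.SAXReader;

public class Dom4jChunkedImport {

    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();

        // Invoked once per matching element; only that sub-tree is built as DOM
        reader.addHandler("/records/record", new ElementHandler() {
            public void onStart(ElementPath path) {
                // nothing to do when the element opens
            }

            public void onEnd(ElementPath path) {
                Element record = path.getCurrent();
                storeInDatabase(record);   // hypothetical persistence call
                record.detach();           // prune the sub-tree so it can be GC'd
            }
        });

        // The returned Document holds only the pruned skeleton, not the full tree
        reader.read(new File("huge.xml"));
    }

    private static void storeInDatabase(Element record) {
        // e.g. read child values with record.elementText("name") and insert via JDBC
    }
}
```

This keeps the convenience of DOM navigation within each record while avoiding a full in-memory tree, much like the SAXDOMIX approach described above.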

3 Comments

Thanks Juned, I am using Dom4j, and I think it also has a SAX parser. As one of the code snippets shows: SAXReader reader = new SAXReader();
The problem with DOM is that the entire XML tree needs to be loaded into memory. No matter how big a heap size you set, if your tree does not fit in it, you will end up with an out-of-memory error. SAX is better for parsing big XML, as you can read it in chunks. I like SAXDOMIX because it mixes SAX and DOM, letting you parse in chunks with ease. Try that.
DOM (as output) is being used intentionally, as many of the XML nodes are inter-dependent, and a purely SAX-based approach makes the processing really slow. Doesn't the SAX parser in Dom4j do the same job?
