I have a 10+ MB XML file which consists of nodes (about 10K to 20K) with relations between them.

<.....>
<Emplyoyee>
    <name>Jack</name>
    <age>35</age>
    <supervisor></supervisor>
    <....>
</Emplyoyee>
<.....>
<.....>
<.....>
<Emplyoyee>
    <name>Smith</name>
    <age>20</age>
    <supervisor>Jack</supervisor>
    <....>
</Emplyoyee>
<.....>

Now I want to parse this file and store all the details in a DB, in an "Employee" table which has an ID field called "supervisorID". So far I have tried building a List of all the employees and then iterating over the List to find each supervisor relation.

Please suggest a faster and more memory-efficient way to do this. What libraries can I use to handle this type of problem?

3
  • You mean you have parsed it using, let's say, the DocumentBuilderFactory in Java, but you need a more efficient way? Commented Sep 19, 2012 at 19:51
  • What happens when you have two "emplyoyees" [sic] with the same name and both are supervisors? How do you link their subordinates to the correct one? Commented Sep 19, 2012 at 20:02
  • @JimGarrison The employee names correspond to xxxx in [email protected] so they are unique. I've used these names for simplicity. Commented Sep 19, 2012 at 20:19

2 Answers

1

Look at the MOXy framework provided by EclipseLink. It is in fact a JAXB implementation behind the scenes, and EclipseLink also handles the ORM side with JPA.

1

You can convert the data from the XML file to Java objects using JAXB and insert those objects into the database using Hibernate + JPA.
You can create two DTOs:
Emplyoyee, with all the info about an employee (name, age, ...),
and
Emplyoyees, with a List<Emplyoyee> for JAXB unmarshalling.

EDIT: WITHOUT JAXB and JPA

You can parse the file using JavaScript and send the SQL queries using Ajax:

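// Load the XML synchronously with MSXML (ActiveX, so Internet Explorer / Windows Script Host only)
var xmlDoc = new ActiveXObject("MSXML.DOMDocument");
xmlDoc.async = false;
xmlDoc.preserveWhiteSpace = true;
xmlDoc.load(pathToFile);
// "//Emplyoyee" matches the elements at any depth; "/Emplyoyee" would match only a root element
var nodes = xmlDoc.selectNodes("//Emplyoyee");
for (var node = nodes.nextNode(); node != null; node = nodes.nextNode())
{
   // read the child nodes, build the SQL query and send it to the server using Ajax
}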
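
If you would rather stay in plain Java without JAXB or JPA, here is a rough sketch (not a tuned implementation) of one possible approach: stream the file once with the JDK's StAX reader, then resolve supervisor names to IDs through a HashMap instead of re-scanning the list for every employee. The class name EmployeeLoader, the Employee holder and the sequential IDs are made up for illustration. It assumes names are unique (as stated in the question's comments), that every employee has a numeric age, and that no nested child of Emplyoyee reuses the tag names name, age or supervisor.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class EmployeeLoader {

    /** Hypothetical row holder; the fields mirror the sample XML. */
    static final class Employee {
        int id;                 // assigned sequentially here (assumption); a real DB may generate it
        String name;
        int age;
        String supervisorName;  // raw text of <supervisor>, may be empty
        Integer supervisorId;   // filled in by resolveSupervisors, null if there is none
    }

    /** Streams the document once and keeps only the fields needed per employee. */
    static List<Employee> parse(Reader xml) {
        List<Employee> employees = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            Employee current = null;
            StringBuilder text = new StringBuilder();
            while (r.hasNext()) {
                switch (r.next()) {
                    case XMLStreamConstants.START_ELEMENT: {
                        text.setLength(0);
                        if ("Emplyoyee".equals(r.getLocalName())) {
                            current = new Employee();
                            current.id = employees.size() + 1;
                        }
                        break;
                    }
                    case XMLStreamConstants.CHARACTERS: {
                        text.append(r.getText());
                        break;
                    }
                    case XMLStreamConstants.END_ELEMENT: {
                        if (current != null) {
                            String value = text.toString().trim();
                            switch (r.getLocalName()) {
                                case "name": current.name = value; break;
                                case "age": current.age = Integer.parseInt(value); break;
                                case "supervisor": current.supervisorName = value; break;
                                case "Emplyoyee": employees.add(current); current = null; break;
                                default: break; // other child elements are ignored in this sketch
                            }
                        }
                        text.setLength(0);
                        break;
                    }
                    default:
                        break;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
        return employees;
    }

    /** Builds a name -> id index once, then resolves each supervisor with one map lookup. */
    static void resolveSupervisors(List<Employee> employees) {
        Map<String, Integer> idByName = new HashMap<>(employees.size() * 2);
        for (Employee e : employees) {
            idByName.put(e.name, e.id);
        }
        for (Employee e : employees) {
            if (e.supervisorName != null && !e.supervisorName.isEmpty()) {
                e.supervisorId = idByName.get(e.supervisorName); // stays null if the name is unknown
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<staff>"
            + "<Emplyoyee><name>Jack</name><age>35</age><supervisor></supervisor></Emplyoyee>"
            + "<Emplyoyee><name>Smith</name><age>20</age><supervisor>Jack</supervisor></Emplyoyee>"
            + "</staff>";
        List<Employee> employees = parse(new StringReader(xml));
        resolveSupervisors(employees);
        for (Employee e : employees) {
            System.out.println(e.id + " " + e.name + " supervisorID=" + e.supervisorId);
        }
    }
}
```

The supervisor can only be resolved after the whole file has been read, because a supervisor may appear after a subordinate in the file. The second pass is linear thanks to the map, though. Persisting the rows (for example with JDBC batch inserts) is left out, since that part depends on your database.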

3 Comments

When using JAXB, xjc does most of the work for you. There are many tools that will create an XSD for you from a sample file, which gets you most of what you need to use with JAXB (I tend to use the XML tools in IntelliJ: jetbrains.com/idea/webhelp/…). JAXB is more than capable of doing this for a 10 MB file; I perform a similar task on a 30 MB file to get daily supplier updates. Oh, and here is a good JAXB primer: oracle.com/technetwork/articles/javase/index-140168.html
You can also start from the JPA entities and add JAXB annotations to map those classes to the desired XML format.
I cannot use Hibernate for various reasons, and I want the whole thing (parsing + persistence) done within 5 seconds. I thought of using threads to speed up persistence, but the problem is that I need to parse all the nodes before I can search for relations. Also, the number of employee entities may grow to 50K-60K.
