I'm relatively new to Node.js. I'm trying to convert 83 XML files, each around 400 MB, into JSON.
Each file contains records like this (except that each real record contains many more statements):
<case-file>
  <serial-number>75563140</serial-number>
  <registration-number>0000000</registration-number>
  <transaction-date>20130101</transaction-date>
  <case-file-header>
    <filing-date>19981002</filing-date>
    <status-code>686</status-code>
    <status-date>20130101</status-date>
  </case-file-header>
  <case-file-statements>
    <case-file-statement>
      <type-code>D10000</type-code>
      <text>"MUSIC"</text>
    </case-file-statement>
    <case-file-statement>
      <type-code>GS0351</type-code>
      <text>compact discs</text>
    </case-file-statement>
  </case-file-statements>
  <case-file-event-statements>
    <case-file-event-statement>
      <code>PUBO</code>
      <type>A</type>
      <description-text>PUBLISHED FOR OPPOSITION</description-text>
      <date>20130101</date>
      <number>28</number>
    </case-file-event-statement>
    <case-file-event-statement>
      <code>NPUB</code>
      <type>O</type>
      <description-text>NOTICE OF PUBLICATION</description-text>
      <date>20121212</date>
      <number>27</number>
    </case-file-event-statement>
  </case-file-event-statements>
</case-file>
I have tried a lot of different Node modules, including sax, node-xml, node-expat and xml2json. Obviously, I need to stream the data from the file and pipe it through an XML parser and then convert it to JSON.
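To make the streaming idea concrete, here is a rough sketch of one possible shape, using only Node's built-in stream and string_decoder modules. It cuts the incoming text into complete <case-file> records, so that each small record could then be handed to a non-streaming converter. The names createRecordSplitter and caseFileStream are made up for illustration, and the sketch assumes that <case-file> never carries attributes and that the tags never appear inside CDATA or comments.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

const OPEN = '<case-file>';
const CLOSE = '</case-file>';

// Returns a feed(text) function that buffers partial input and returns
// every complete <case-file>...</case-file> record found so far.
function createRecordSplitter() {
  let buffer = '';
  return function feed(text) {
    buffer += text;
    const records = [];
    for (;;) {
      const start = buffer.indexOf(OPEN);
      if (start === -1) {
        // Keep only a short tail that might be the start of a split opening tag.
        buffer = buffer.slice(-(OPEN.length - 1));
        return records;
      }
      const end = buffer.indexOf(CLOSE, start);
      if (end === -1) {
        // Record is still incomplete; drop everything before it and wait for more input.
        buffer = buffer.slice(start);
        return records;
      }
      records.push(buffer.slice(start, end + CLOSE.length));
      buffer = buffer.slice(end + CLOSE.length);
    }
  };
}

// Wraps the splitter in a Transform so it can sit in a pipe() chain.
// StringDecoder keeps multi-byte UTF-8 characters intact across chunk boundaries.
function caseFileStream() {
  const feed = createRecordSplitter();
  const decoder = new StringDecoder('utf8');
  return new Transform({
    readableObjectMode: true,
    transform(chunk, encoding, callback) {
      for (const record of feed(decoder.write(chunk))) this.push(record);
      callback();
    },
    flush(callback) {
      for (const record of feed(decoder.end())) this.push(record);
      callback();
    },
  });
}

// Small demo with a record split across two chunks.
const feed = createRecordSplitter();
console.log(feed('<case-file><a>1</a></case-'));  // no complete record yet
console.log(feed('file><case-file><a>2</a>'));    // the first record completes here
```

With a real file, this would presumably be used as fs.createReadStream(path).pipe(caseFileStream()), with a 'data' handler receiving one record string at a time, so memory use stays close to the size of a single record rather than the whole 400 MB file.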
I have also read a number of blog posts that attempt to explain, albeit superficially, how to parse XML.
In the Node universe, I tried sax first, but I can't figure out how to extract the data in a form that I can convert to JSON. xml2json won't work on streams. node-xml looks encouraging, but I can't figure out how it parses chunks in any manner that makes sense. node-expat points to the libexpat documentation, which appears to require a Ph.D. node-elementtree does the same, pointing to the Python implementation, but it doesn't explain what has been implemented or how to use it.
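For the sax case specifically, the missing piece seems to be something that turns open-tag, text, and close-tag events into an object. Below is a minimal sketch of that idea using a stack, with one frame per open element. The name createRecordBuilder and its methods are hypothetical, the events are fed by hand so the example runs without any parser installed, and it assumes elements have no attributes and no mixed content, as in the sample above.

```javascript
// Builds one plain object per <case-file> from SAX-style events.
function createRecordBuilder(onRecord, recordTag = 'case-file') {
  const stack = []; // one frame per element that is currently open

  return {
    openTag(name) {
      const parent = stack[stack.length - 1];
      if (parent) {
        parent.hasChildren = true;
        parent.text = ''; // discard whitespace that sits between child elements
      }
      stack.push({ children: {}, hasChildren: false, text: '' });
    },

    text(value) {
      const frame = stack[stack.length - 1];
      // Text is only kept for leaf elements; parsers may deliver it in several pieces.
      if (frame && !frame.hasChildren) frame.text += value;
    },

    closeTag(name) {
      const frame = stack.pop();
      if (!frame) return; // unbalanced input; a strict parser would report this itself
      const value = frame.hasChildren ? frame.children : frame.text.trim();
      if (name === recordTag) {
        onRecord(value); // emit the finished record instead of keeping it in memory
        return;
      }
      const parent = stack[stack.length - 1];
      if (!parent) return; // root element outside any record
      const siblings = parent.children;
      if (Object.prototype.hasOwnProperty.call(siblings, name)) {
        // A repeated tag name turns the entry into an array.
        siblings[name] = [].concat(siblings[name], [value]);
      } else {
        siblings[name] = value;
      }
    },
  };
}

// Demo: the events a parser would emit for a cut-down version of the sample record.
const records = [];
const builder = createRecordBuilder((record) => records.push(record));

builder.openTag('case-file');
builder.openTag('serial-number');
builder.text('75563140');
builder.closeTag('serial-number');
builder.text('\n');
builder.openTag('case-file-statements');
for (const [code, text] of [['D10000', '"MUSIC"'], ['GS0351', 'compact discs']]) {
  builder.openTag('case-file-statement');
  builder.openTag('type-code');
  builder.text(code);
  builder.closeTag('type-code');
  builder.openTag('text');
  builder.text(text);
  builder.closeTag('text');
  builder.closeTag('case-file-statement');
}
builder.closeTag('case-file-statements');
builder.closeTag('case-file');

console.log(JSON.stringify(records[0], null, 2));
```

As far as I understand sax's stream API, the wiring would be along the lines of saxStream.on('opentag', node => builder.openTag(node.name)), saxStream.on('text', t => builder.text(t)), and saxStream.on('closetag', name => builder.closeTag(name)), with the file piped into a stream created by sax.createStream(true). One limitation of this sketch is that a container with only a single child of a repeating type ends up as an object rather than a one-element array, so the JSON shape can vary between records.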
Can someone point me to an example that I could use to get started?