9

I would like to convert XML into JSON (concretely, a OAI-PMH response). I am currently using node.js xml2js, but the issue is that JSON is very verbose, with way to many levels of nesting and arrays, even when there is only one element as a child and will never be more than one. The issue is that xml2js does not know anything about the schema of the XML file, so it has to be conservative.

My question is, is there any other (preferably JavaScript) code which would use XML Schema to guide conversion process? So if schema defines types and structure of XML, that than JSON takes advantage of this and have correct types automatically, and not unnecessary array levels.

6 Answers 6

2

At the end, I decided just to implement such a package: xml4js.

So for XML taken from XML Primer:

<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20" xmlns="http://www.example.com/PO">
   <shipTo country="US">
      <name>Alice Smith</name>
      <street>123 Maple Street</street>
      <city>Mill Valley</city>
      <state>CA</state>
      <zip>90952</zip>
   </shipTo>
   <billTo country="US">
      <name>Robert Smith</name>
      <street>8 Oak Avenue</street>
      <city>Old Town</city>
      <state>PA</state>
      <zip>95819</zip>
   </billTo>
   <comment>Hurry, my lawn is going wild!</comment>
   <items>
      <item partNum="872-AA">
         <productName>Lawnmower</productName>
         <quantity>1</quantity>
         <USPrice>148.95</USPrice>
         <comment>Confirm this is electric</comment>
      </item>
      <item partNum="926-AA">
         <productName>Baby Monitor</productName>
         <quantity>1</quantity>
         <USPrice>39.98</USPrice>
         <shipDate>1999-05-21</shipDate>
      </item>
   </items>
</purchaseOrder>

Without using a XML Schema to guide a conversion process, with explicit arrays turned on, you would get:

{
  "purchaseOrder": {
    "$": {
      "orderDate": "1999-10-20",
      "xmlns": "http://www.example.com/PO"
    },
    "shipTo": [
      {
        "$": {
          "country": "US"
        },
        "name": [
          "Alice Smith"
        ],
        "street": [
          "123 Maple Street"
        ],
        "city": [
          "Mill Valley"
        ],
        "state": [
          "CA"
        ],
        "zip": [
          "90952"
        ]
      }
    ],
    "billTo": [
      {
        "$": {
          "country": "US"
        },
        "name": [
          "Robert Smith"
        ],
        "street": [
          "8 Oak Avenue"
        ],
        "city": [
          "Old Town"
        ],
        "state": [
          "PA"
        ],
        "zip": [
          "95819"
        ]
      }
    ],
    "comment": [
      "Hurry, my lawn is going wild!"
    ],
    "items": [
      {
        "item": [
          {
            "$": {
              "partNum": "872-AA"
            },
            "productName": [
              "Lawnmower"
            ],
            "quantity": [
              "1"
            ],
            "USPrice": [
              "148.95"
            ],
            "comment": [
              "Confirm this is electric"
            ]
          },
          {
            "$": {
              "partNum": "926-AA"
            },
            "productName": [
              "Baby Monitor"
            ],
            "quantity": [
              "1"
            ],
            "USPrice": [
              "39.98"
            ],
            "shipDate": [
              "1999-05-21"
            ]
          }
        ]
      }
    ]
  }
}

But the package gives you this:

{
  "purchaseOrder": {
    "$": {
      "orderDate": "1999-10-20T00:00:00.000Z"
    },
    "shipTo": {
      "$": {
        "country": "US"
      },
      "name": "Alice Smith",
      "street": "123 Maple Street",
      "city": "Mill Valley",
      "state": "CA",
      "zip": 90952
    },
    "billTo": {
      "$": {
        "country": "US"
      },
      "name": "Robert Smith",
      "street": "8 Oak Avenue",
      "city": "Old Town",
      "state": "PA",
      "zip": 95819
    },
    "comment": "Hurry, my lawn is going wild!",
    "items": {
      "item": [
        {
          "$": {
            "partNum": "872-AA"
          },
          "productName": "Lawnmower",
          "quantity": 1,
          "USPrice": 148.95,
          "comment": "Confirm this is electric"
        },
        {
          "$": {
            "partNum": "926-AA"
          },
          "productName": "Baby Monitor",
          "quantity": 1,
          "USPrice": 39.98,
          "shipDate": "1999-05-21T00:00:00.000Z"
        }
      ]
    }
  }
}
Sign up to request clarification or add additional context in comments.

1 Comment

The example which is attached are not working as expected.github.com/peerlibrary/node-xml4js Any help, would be greatly helpful
1

I had a similar, but opposite, problem with X2JS : it would not create a list if there was only one child element ( even when it should be a list ). The solution ( for me ) was to supply the extra options "arrayFormPaths" to the converter -- this causes matching elements to become an array, even if there is only one element.

I agree that there should be a way to do this using an XML Schema...but I couldn't find anything either.

2 Comments

@Mitar, that looks promising. Any plans to update it? I just tried installing: "found 13 vulnerabilities (1 low, 10 moderate, 2 high)"
Where can I find your referenced X2JS library? It seems distinct from the OP's xml2js, as I can not find your "arrayFormPaths" option there. Is it this one here? github.com/abdolence/x2js
0

Found the following project library on GitHub that does the job : Schema Aware XML to JSON translator .

This library does the job in java. It is still not fully mature but does the conversion using the Schema provided.

Comments

0

There is a very well maintained python package named xmlschema that would do that without any hassle.

Using to_dict method.

import xmlschema
import simplejson as json

my_schema = xmlschema.XMLSchema("../shiporder.xsd")
obj = my_schema.to_dict('../shiporder.xml')
# in case you want to save the json in a file
json.dumps(obj) 

You can get the example xml and xsd for ship order from w3schools. Once downloaded try to leave only one item in the ship order xml and see how the schema retains the array type. The number types are also preserved after conversion to json.

1 Comment

I tried this approach. It seems xmlschema or simplejson have a dependency on Counter which is only available in Python 2.7. However xmlschema won't install on Python 2.7.
-1

If you are looking for a java solution.

Comments

-5

An XML schema is used for XML validation. It is not possible to convert anything with a XML schema. A schema defines the structure of an XML document. If you want to convert something you need a XSLT which is a XML transformation specification. See here how to do it in JavaScript: http://www.w3schools.com/xsl/xsl_client.asp

5 Comments

I want to convert structure of XML document into JSON. And of course it can help. For example, maxOccurs can help you know if you should create an array or not (if maxOccurs is 1, you do not need an array). And type can help you convert to JSON types.
@Mitar You can convert a XML schema to JSON, but this also requires XSLT. A schema does not convert anything.
I am saying that schema could be used to help in the conversion process.
@Mitar Yes a text editor could also help the transformation process.
npmjs.com/package/xml4js "Package builds upon node-xml2js, detects and parses XML Schema which is then used to transform JavaScript object into a consistent schema-driven structure."

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.