This is a follow-up to a question I asked previously about deriving a totally flat structure from an XML node: "Converting an xml doc into a specific dot-expanded json structure".

Suppose I have the same XML to start with:

<Item ID="288917">
  <Main>
    <Platform>iTunes</Platform>
    <PlatformID>353736518</PlatformID>
  </Main>
  <Genres>
    <Genre FacebookID="6003161475030">Comedy</Genre>
    <Genre FacebookID="6003172932634">TV-Show</Genre>
  </Genres>
  <Products>
    <Product Country="CA">
      <URL>https://itunes.apple.com/ca/tv-season/id353187108?i=353736518</URL>
      <Offers>
        <Offer Type="HDBUY">
          <Price>3.49</Price>
          <Currency>CAD</Currency>
        </Offer>
        <Offer Type="SDBUY">
          <Price>2.49</Price>
          <Currency>CAD</Currency>
        </Offer>
      </Offers>
    </Product>
    <Product Country="FR">
      <URL>https://itunes.apple.com/fr/tv-season/id353187108?i=353736518</URL>
      <Rating>Tout public</Rating>
      <Offers>
        <Offer Type="HDBUY">
          <Price>2.49</Price>
          <Currency>EUR</Currency>
        </Offer>
        <Offer Type="SDBUY">
          <Price>1.99</Price>
          <Currency>EUR</Currency>
        </Offer>
      </Offers>
    </Product>
  </Products>
</Item>

Now I would like to convert it into a nested JSON object in a specific format (slightly different from what the xmltodict library produces). Here is the structure I'd like to derive:

{
    "Item[@ID]": 288917,
    "Item.Main.Platform": "iTunes",
    "Item.Main.PlatformID": "353736518",
    "Item.Genres": [
        {
            "[@FacebookID]": "6003161475030",
            "Value": "Comedy"
        },
        {
            "[@FacebookID]": "6003172932634",
            "Value": "TV-Show"
        }
    ],
    "Item.Products": [
        {
            "[@Country]": "CA",
            "URL": "https://itunes.apple.com/ca/tv-season/id353187108?i=353736518",
            "Offers.Offer": [
                {
                    "[@Type]": "HDBUY",
                    "Price": "3.49",
                    "Currency": "CAD"
                },
                {
                    "[@Type]": "SDBUY",
                    "Price": "2.49",
                    "Currency": "CAD"
                }
            ]
        },
        {
            "[@Country]": "FR",
            "URL": "https://itunes.apple.com/fr/tv-season/id353187108?i=353736518",
            "Offers.Offer": [
                {
                    "[@Type]": "HDBUY",
                    "Price": "2.49",
                    "Currency": "EUR"
                },
                {
                    "[@Type]": "SDBUY",
                    "Price": "1.99",
                    "Currency": "EUR"
                }
            ]
        }
    ]
}

The main difference is that, instead of collapsing everything into a list of flat values, lists of dicts are allowed. How could this be done?

  • You can do this using XSLT, even version 1.0. Commented Dec 31, 2018 at 19:47

1 Answer

While doing the above by hand might be a nice challenge, xmltodict already does a great job with this and can produce the format with slight alterations.

Here are the changes to make in a copy of the xmltodict source:

  1. Change the default of cdata_key from '#text' to 'Value'.
  2. Change the default of attr_prefix from '@' to '[@'.
  3. Add a new parameter attr_suffix=']' to the __init__ method (and store it as self.attr_suffix).
  4. Where the attribute key is built, change it to key = self.attr_prefix+self._build_name(key)+self.attr_suffix.

That should give the key format you're looking for (the nesting stays as xmltodict produces it), using an already tested module:

>>> import json
>>> from lxml import etree
>>> from utils import xmltodict  # local copy of xmltodict with the four changes above
>>> # s holds the XML document from the question as a string
>>> node = etree.fromstring(s)
>>> d = xmltodict.parse(etree.tostring(node))
>>> print(json.dumps(d, indent=4))
{
    "Item": {
        "[@ID]": "288917",
        "Main": {
            "Platform": "iTunes",
            "PlatformID": "353736518"
        },
        "Genres": {
            "Genre": [
                {
                    "[@FacebookID]": "6003161475030",
                    "Value": "Comedy"
                },
                {
                    "[@FacebookID]": "6003172932634",
                    "Value": "TV-Show"
                }
            ]
        },
        "Products": {
            "Product": [
                {
                    "[@Country]": "CA",
                    "URL": "https://itunes.apple.com/ca/tv-season/id353187108?i=353736518",
                    "Offers": {
                        "Offer": [
                            {
                                "[@Type]": "HDBUY",
                                "Price": "3.49",
                                "Currency": "CAD"
                            },
                            {
                                "[@Type]": "SDBUY",
                                "Price": "2.49",
                                "Currency": "CAD"
                            }
                        ]
                    }
                },
                {
                    "[@Country]": "FR",
                    "URL": "https://itunes.apple.com/fr/tv-season/id353187108?i=353736518",
                    "Rating": "Tout public",
                    "Offers": {
                        "Offer": [
                            {
                                "[@Type]": "HDBUY",
                                "Price": "2.49",
                                "Currency": "EUR"
                            },
                            {
                                "[@Type]": "SDBUY",
                                "Price": "1.99",
                                "Currency": "EUR"
                            }
                        ]
                    }
                }
            ]
        }
    }
}
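
For comparison, roughly the same shape can be built without patching xmltodict, using only the standard library. This is a minimal sketch, not what xmltodict does internally: element_to_dict is a hypothetical helper name, and it assumes no element has a child literally named "Value" and that mixed content (text interleaved with child elements) does not need to be preserved.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an Element into the "[@attr]" / "Value" shape shown above.

    Hypothetical helper for illustration; not part of xmltodict.
    """
    # Attributes become "[@name]" keys.
    node = {f"[@{name}]": value for name, value in elem.attrib.items()}

    # Child elements: a tag that repeats is collected into a list.
    for child in elem:
        converted = element_to_dict(child)
        if child.tag in node:
            existing = node[child.tag]
            if not isinstance(existing, list):
                node[child.tag] = existing = [existing]
            existing.append(converted)
        else:
            node[child.tag] = converted

    text = (elem.text or "").strip()
    if not node:
        # Leaf element without attributes: plain string, or None if empty.
        return text or None
    if text:
        # Text next to attributes (or children) is stored under "Value".
        node["Value"] = text
    return node


# A shortened version of the XML from the question.
xml_text = """<Item ID="288917">
  <Genres>
    <Genre FacebookID="6003161475030">Comedy</Genre>
    <Genre FacebookID="6003172932634">TV-Show</Genre>
  </Genres>
</Item>"""

root = ET.fromstring(xml_text)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=4))
```

As with xmltodict, a tag that occurs only once stays a single dict rather than a one-element list, so code consuming the result has to handle both cases.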

1 Comment

I think this is the easiest way to do this.