I need help converting XML to a specific JSON structure. The XML looks like this:

<DataGrid> 
<DataRow>
    <RowID>1</RowID>
    <Date>26/10/2014</Date>
    <Owner>ABC Company</Owner>        
    <Make>Mitsubishi</Make>
    <Model_Family>Lancer</Model_Family>
    <Submodel>Lancer 2.0 GSR Hatch CVT</Submodel>
    <Origin/>
    <CC_Rating>1997</CC_Rating>
    <Weight>2000</Weight> 
</DataRow> 
<DataRow>
    <RowID>2</RowID>
    <Date>26/10/2014</Date>
    <Owner>ABC Company</Owner>        
    <Make>Mazda</Make>
    <Model_Family>Axela</Model_Family>
    <Submodel/>
    <Origin>Japan</Origin>
    <CC_Rating>1497</CC_Rating>
    <Weight/> 
</DataRow>
 <DataRow>
    <RowID>3</RowID>
    <Date>26/10/2014</Date>
    <Owner>Test Company</Owner>        
    <Make>Kia</Make>
    <Model_Family>Sorento</Model_Family>
    <Submodel/>
    <Origin>Korea</Origin>
    <CC_Rating>2200</CC_Rating>
    <Weight>2500</Weight>
</DataRow>
<DataRow>
    <RowID>4</RowID>
    <Date>26/10/2014</Date>
    <Owner>Test Company</Owner>        
    <Make>Nissan</Make>
    <Model_Family>Pathfinder</Model_Family>
    <Submodel>SUV</Submodel>
    <Origin>Japan</Origin>
    <CC_Rating>2000</CC_Rating>
    <Weight>2000</Weight>
</DataRow>
</DataGrid>

There can be one or more files in the above format. I need to read all of those files, group the rows by Owner, convert the grouped objects to JSON, and send that JSON to a REST web service. The required JSON format is as follows.

{
"batches": [
    {
        "Owner": "ABC Company",
        "Date": "2014-10-26",
        "Vehicles": [
            {
                "Make": "Mitsubishi",
                "ModelFamily": "Lancer",
                "SubModel": "Lancer 2.0 GSR Hatch CVT",
                "Origin": null,
                "CcRating": "1997",
                "Weight": "2000"                    
            },
            {                   
                "Make": "Mazda",
                "ModelFamily": "Axela",
                "SubModel": null,
                "Origin": "Japan",
                "CcRating": "1497",
                "Weight": null                   
            }
        ]
    },
    {
        "Owner": "Test Company",
        "Date": "2014-10-26",
        "Vehicles": [
            {                   
                "Make": "Kia",
                "ModelFamily": "Sorento",
                "SubModel": null,
                "Origin": "Korea",
                "CcRating": "2200",
                "Weight": "2500"                  
            },
            {                    
                "Make": "Nissan",
                "ModelFamily": "Pathfinder",
                "SubModel": "SUV",
                "Origin": "Japan",
                "CcRating": "2000",
                "Weight": "2000"                   
            }
        ]
    }
]

}

This needs to be done using Windows PowerShell. I am a Java guy and have no idea how to do it in PowerShell, apart from connecting to a remote FTP server and reading all the XML files. I would really appreciate it if someone could help me with this.

2 Answers

XML Stuff

PowerShell has built-in support for working with XML. First, it has the [XML] type accelerator (an alias for System.Xml.XmlDocument). You can cast strings to XML like so:

$someXml = '...' # pretend there's XML in there!
$xmlObject = [XML]$someXml

The ConvertTo-Xml cmdlet does something different: it serializes a .NET object into an XML representation of that object, returned as a document (XML object), a string, or a stream (array of strings) depending on -As. It does not parse existing XML text, so use the [XML] cast for that.

Reading From a File:

# -Raw (PowerShell 3.0+) reads the whole file as a single string
$xmlObject = [XML](Get-Content -Path C:\Path\to\my.xml -Raw)

Working with XML Objects

Once you've got your object, you can access nodes as properties:

$xmlObject.SomeNode.SomeChild
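
As a small illustration of property access, here is a sketch using a trimmed copy of the XML from the question (the sample data and variable names are placeholders of mine). Repeated child elements such as DataRow come back as a collection, and an empty element such as <Origin/> comes back as an empty string rather than $null, which matters if the JSON should contain null:

```powershell
# A trimmed, well-formed sample of the question's XML (illustrative data only)
$someXml = @'
<DataGrid>
  <DataRow>
    <RowID>1</RowID>
    <Owner>ABC Company</Owner>
    <Make>Mitsubishi</Make>
    <Origin/>
  </DataRow>
  <DataRow>
    <RowID>2</RowID>
    <Owner>ABC Company</Owner>
    <Make>Mazda</Make>
    <Origin>Japan</Origin>
  </DataRow>
</DataGrid>
'@

$xmlObject = [xml]$someXml

# Repeated elements are exposed as a collection; @() guards the single-row case
$rows = @($xmlObject.DataGrid.DataRow)

# Leaf elements that contain only text are returned as plain strings
$makes = $rows | ForEach-Object { $_.Make }

# An empty element (<Origin/>) is returned as an empty string
$firstOrigin = $rows[0].Origin
```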

You can also use XPath to select a single node or multiple nodes:

$xmlObject.SelectSingleNode("//this[1]")
$xmlObject.SelectNodes("//these")

Or, for a more PowerShell-style approach, use the Select-Xml cmdlet, which returns SelectXmlInfo objects whose Node property holds each matched node:

$xmlObject | Select-Xml "//these"
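
To make the difference between the two approaches concrete, here is a minimal sketch against a two-row document shaped like the question's data (the inline XML is made up for the example). SelectNodes hands back the nodes themselves, while Select-Xml wraps each match:

```powershell
# Two-row sample in the same shape as the question's XML (illustrative only)
$xmlObject = [xml]'<DataGrid><DataRow><Owner>ABC Company</Owner><Make>Mazda</Make></DataRow><DataRow><Owner>Test Company</Owner><Make>Kia</Make></DataRow></DataGrid>'

# SelectNodes returns the matching XmlNode objects directly;
# the predicate [Make='Kia'] filters rows by a child element's text
$kiaRows = $xmlObject.SelectNodes("//DataRow[Make='Kia']")

# Select-Xml returns SelectXmlInfo wrappers; the actual node is in .Node
$owners = $xmlObject |
    Select-Xml -XPath '//DataRow/Owner' |
    ForEach-Object { $_.Node.InnerText }
```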

I'm leaving out a lot of other stuff, especially manipulation, because it seems like you just need to find information and group it together.

JSON Stuff

There isn't a lot to know about JSON in PowerShell.

Use ConvertFrom-Json to convert existing JSON into an object, and use ConvertTo-Json to convert an object into a JSON string.
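
A quick round trip shows both cmdlets. This is only a sketch; the JSON literal is a made-up vehicle record in the shape the question asks for:

```powershell
# Parse JSON text into an object
$vehicle = '{ "Make": "Kia", "Origin": "Korea", "Weight": null }' | ConvertFrom-Json

# Properties of the parsed object are accessed like any other object
$make = $vehicle.Make

# A JSON null becomes $null
$weightIsNull = $null -eq $vehicle.Weight

# Serialize back to JSON text; -Compress omits whitespace and indentation
$json = $vehicle | ConvertTo-Json -Compress
```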

Hashtables

Sometimes called hashes or associative arrays, as I'm sure you're aware. In PowerShell you use them like this:

$hash = @{
    Key1 = 'Value1'
    Key2 = 'Value2'
    Key3 = 10
}

# on one line:
$hash = @{ Key1='Val1';Key2='Val2' }

# adding pairs
$hash['NewKey'] = 'NewVal'
$hash.NewKey = 'NewVal'

# retrieving
$hash['NewKey']
$hash.NewKey

The value can be an array, an object, another hash, etc.

$complex = @{
    ThisThing = @{
        Key1 = 'val1'
        Key2 = 5
    } 
}
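
One detail that matters when the hash is headed for JSON: a plain @{} does not guarantee key order, whereas [ordered]@{} (PowerShell 3.0+) keeps keys in insertion order. A minimal sketch, with values borrowed from the question purely for illustration:

```powershell
# [ordered] keeps keys in the order they are written
$batch = [ordered]@{
    Owner    = 'ABC Company'
    Date     = '2014-10-26'
    # @( ) forces an array, so a single vehicle still serializes as a JSON list
    Vehicles = @(
        [ordered]@{ Make = 'Mazda'; SubModel = $null }
    )
}

# $null values are written as JSON null
$json = $batch | ConvertTo-Json -Depth 4 -Compress
```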

REST

Invoke-RestMethod is the easiest way to make REST calls in PowerShell (requires version 3.0+). It also converts JSON and XML responses into objects for you.
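
A POST with a JSON body might look roughly like the sketch below. Send-JsonBatch is a hypothetical helper name of mine, and the URL in the usage comment is a placeholder; the question does not say what the real service expects (authentication, headers, and so on), so treat this as a starting point only:

```powershell
# Send-JsonBatch is a hypothetical helper name, not a built-in cmdlet
function Send-JsonBatch {
    param(
        [Parameter(Mandatory = $true)][string]$Uri,   # service endpoint (assumed)
        [Parameter(Mandatory = $true)][string]$Json   # payload, e.g. from ConvertTo-Json
    )

    # Splatting a hashtable of parameters keeps the call readable
    $params = @{
        Uri         = $Uri
        Method      = 'Post'
        ContentType = 'application/json'
        Body        = $Json
    }

    Invoke-RestMethod @params
}

# Example call (placeholder URL, so it is left commented out):
# $response = Send-JsonBatch -Uri 'https://example.com/api/batches' -Json $json
```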

How to Proceed

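Once you're able to parse the information out of the XML, build a nested hash or array of hashes that contains the structure you want, and then convert it to JSON. ConvertTo-Json only serializes two levels deep by default, so raise -Depth for a nested structure like yours:

$mySpecialHash | ConvertTo-Json -Depth 5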

Take special note of how arrays and hashes are represented in the resulting JSON and maybe change the way you're building them to get the output you want.
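
Putting the pieces together, one possible shape for the whole transformation is sketched below. Several things here are my assumptions rather than anything stated in the question: the helper name ConvertTo-NullIfEmpty, taking each batch's Date from the first row of its group, and the inline sample standing in for the real files.

```powershell
# Trimmed sample of the question's XML (illustrative; real input would be read from files)
$someXml = @'
<DataGrid>
  <DataRow>
    <Date>26/10/2014</Date><Owner>ABC Company</Owner>
    <Make>Mitsubishi</Make><Model_Family>Lancer</Model_Family>
    <Submodel>Lancer 2.0 GSR Hatch CVT</Submodel><Origin/>
    <CC_Rating>1997</CC_Rating><Weight>2000</Weight>
  </DataRow>
  <DataRow>
    <Date>26/10/2014</Date><Owner>ABC Company</Owner>
    <Make>Mazda</Make><Model_Family>Axela</Model_Family>
    <Submodel/><Origin>Japan</Origin>
    <CC_Rating>1497</CC_Rating><Weight/>
  </DataRow>
  <DataRow>
    <Date>26/10/2014</Date><Owner>Test Company</Owner>
    <Make>Kia</Make><Model_Family>Sorento</Model_Family>
    <Submodel/><Origin>Korea</Origin>
    <CC_Rating>2200</CC_Rating><Weight>2500</Weight>
  </DataRow>
</DataGrid>
'@
$rows = @(([xml]$someXml).DataGrid.DataRow)

# Hypothetical helper: empty XML elements arrive as '', but the target JSON wants null
function ConvertTo-NullIfEmpty([string]$Value) {
    if ([string]::IsNullOrWhiteSpace($Value)) { $null } else { $Value }
}

$invariant = [cultureinfo]::InvariantCulture

$batches = @(
    $rows | Group-Object -Property { $_.Owner } | ForEach-Object {
        $group = $_
        # Assumption: every row of an owner shares one date, so the first row's date is used
        $date = [datetime]::ParseExact($group.Group[0].Date, 'dd/MM/yyyy', $invariant)

        [ordered]@{
            Owner    = $group.Name
            Date     = $date.ToString('yyyy-MM-dd', $invariant)
            Vehicles = @(
                foreach ($row in $group.Group) {
                    [ordered]@{
                        Make        = (ConvertTo-NullIfEmpty $row.Make)
                        ModelFamily = (ConvertTo-NullIfEmpty $row.Model_Family)
                        SubModel    = (ConvertTo-NullIfEmpty $row.Submodel)
                        Origin      = (ConvertTo-NullIfEmpty $row.Origin)
                        CcRating    = (ConvertTo-NullIfEmpty $row.CC_Rating)
                        Weight      = (ConvertTo-NullIfEmpty $row.Weight)
                    }
                }
            )
        }
    }
)

# The structure is four levels deep, so the default -Depth of 2 is not enough
$json = [ordered]@{ batches = $batches } | ConvertTo-Json -Depth 5
```

To cover several files, the rows from each file could be collected first (for example by looping over Get-ChildItem results and casting each file's content to [xml]) and then passed through the same Group-Object step.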

If you have specific questions about a particular method or piece of code, then post specific questions on SO about that piece of it.

3 Comments

This answer helped me complete my task, so I am accepting it as the answer.
How is ConvertTo-Xml distinct from Import-Clixml?
@NicholasSaunders Import-Clixml is the counterpart to Export-Clixml. CLIXML is a format specific to PowerShell for serializing and deserializing objects. The ConvertTo-Xml documentation page says it best: "This cmdlet is similar to Export-Clixml except that Export-Clixml stores the resulting XML in a Common Language Infrastructure (CLI) XML file that can be reimported as objects with Import-Clixml. ConvertTo-Xml returns an in-memory representation of an XML document, so you can continue to process it in PowerShell. ConvertTo-Xml does not have an option to convert objects to CLI XML."

For anyone still looking for an easy way to convert an entire XML document in 2024 using PowerShell, I threw together this quick function that recursively converts a System.Xml.XmlDocument to a standard PSObject. From there you can pipe it into ConvertTo-Json or whatever you want.

It's pretty basic so may require some modification for more complex scenarios, but it works for simple stuff like RSS feeds.

function Convert-XmlNodeToObject
{
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [ValidateNotNull()]
        [System.Xml.XmlNode]$Node,

        # By default, a node with no attributes and a single child of node type 'Text' is returned as that child's InnerText rather than as a separate object.
        # This eliminates '#text' child nodes where possible. Pass this switch to keep them as separate nodes.
        [Parameter(Mandatory = $false)]
        [switch]$IncludeChildTextNodes
    )

    $NODE_TYPES = @(
        [System.Xml.XmlNodeType]::Element,
        [System.Xml.XmlNodeType]::Attribute,
        [System.Xml.XmlNodeType]::Text
    )

    if ($Node.NodeType -eq [System.Xml.XmlNodeType]::Text)
    {
        Write-Debug ("Returning child node '{0}' as text because NodeType is '{1}'" -f $Node.Name, $Node.NodeType)
        return $Node.InnerText
    }

    if (!$IncludeChildTextNodes.IsPresent -and $Node.HasChildNodes -and $Node.ChildNodes.Count -eq 1 -and !$Node.HasAttributes)
    {
        $child = $Node.ChildNodes | Select-Object -First 1
        if ($child.NodeType -eq [System.Xml.XmlNodeType]::Text)
        {
            Write-Debug ("Returning child '{0}' node '{1}' as text for parent '{2}' because the parent has no attributes and the child text node is the only node. Pass IncludeChildTextNodes to prevent this behaviour." -f $child.NodeType, $child.Name, $Node.Name)
            return $child.InnerText
        }
    }

    $nodeHt = ([System.Management.Automation.OrderedHashtable]@{})

    foreach ($attribute in $Node.Attributes)
    {
        if ($nodeHt.Contains($attribute.Name))
        {
            Write-Warning ("Skipping attribute '{0}' on node '{1}' because the name is a duplicate" -f $attribute.Name, $Node.Name)
            continue
        }

        $thisObject = Convert-XmlNodeToObject $attribute -IncludeChildTextNodes:$IncludeChildTextNodes.IsPresent
        if ($null -ne $thisObject)
        {
            Write-Debug ("Recursively adding attribute '{0}' for parent node '{1}'" -f $attribute.Name, $Node.Name)
            $nodeHt += @{
                $attribute.Name = $thisObject
            }
        }
        else
        {
            Write-Warning ("Omitting attribute '{0}' for parent '{1}' because the result was null." -f $attribute.Name, $Node.Name)
        }
    }

    # Skip stuff we're not interested in, e.g. declarations
    foreach ($child in $Node.ChildNodes.Where({ $_.NodeType -in $NODE_TYPES }))
    {
        if ($nodeHt.Contains($child.Name))
        {
            # Duplicate node name -- assume an array
            if ($null -ne $nodeHt[$child.Name] -and $nodeHt[$child.Name].GetType().Name.StartsWith('List'))
            {
                Write-Debug ("Duplicate child node name '{0}' of type '{1}' on parent node '{2}' -- using existing collection" -f $child.Name, $child.NodeType, $Node.Name)
                $nodes = $nodeHt[$child.Name]
            }
            else
            {
                Write-Verbose ("Duplicate child node name '{0}' of type '{1}' on parent node '{2}' -- assuming a collection of nodes" -f $child.Name, $child.NodeType, $Node.Name)
                $nodes = New-Object System.Collections.Generic.List[object]
                if ($null -ne $nodeHt[$child.Name])
                {
                    $nodes.Add(($nodeHt[$child.Name]))
                }
            }

            $thisObject = Convert-XmlNodeToObject $child -IncludeChildTextNodes:$IncludeChildTextNodes.IsPresent
            if ($null -ne $thisObject)
            {
                $nodes.Add(($thisObject))
            }
            else
            {
                Write-Warning ("Omitting child '{0}' node '{1}' for parent '{2}' (index {3}) because the result was null." -f $child.NodeType, $child.Name, $Node.Name, $nodes.Count)
            }

            $nodeHt[$child.Name] = $nodes
        }
        else
        {
            $thisObject = Convert-XmlNodeToObject $child -IncludeChildTextNodes:$IncludeChildTextNodes.IsPresent
            if ($null -ne $thisObject)
            {
                Write-Debug ("Recursively adding child node '{0}' of type '{1}' for parent node '{2}'" -f $child.Name, $child.NodeType, $Node.Name)
                $nodeHt += @{
                    $child.Name = $thisObject
                }
            }
            else
            {
                Write-Warning ("Omitting child '{0}' node '{1}' for parent '{2}' because the result was null." -f $child.NodeType, $child.Name, $Node.Name)
            }
        }
    }

    return (New-Object PSObject -Property $nodeHt)
}

It accepts a .NET XmlNode, which is the base class of XmlDocument. If you have raw XML you'll need to create an instance of XmlDocument first:

$xml = [System.Xml.XmlDocument]$someXml
$object = $xml | Convert-XmlNodeToObject

Then to get some JSON:

$object | ConvertTo-Json -Depth 16
