I have a nested XML file that I need to convert to CSV using PowerShell. Unfortunately, I am still a beginner and was not able to solve this problem with the existing threads I found online.

I tried reading the XML file into PowerShell and creating a new object, but my CSV export doesn't even contain that insufficient result... :(

The XML file I have looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<Data source="Jhonny" datetime="2019-04-23T10:07:50+02:00" timezone="Europe">
    <dealerships>
        <location name="Germany">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="7.3"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="7.8"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="7.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="6.0"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="4.0"/>
            </series>
        </location>
        <location name="USA">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="5.1"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.1"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.6"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.1"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.0"/>
            </series>
        </location>
    </dealerships>
</Data>

The result I am aiming for would look like this:

Location;Date/Time;Sold Cars car;Sold Cars Auto
Germany; 2019-04-22T00:00:00+02:00; 7.3;4.0
Germany; 2019-04-22T01:00:00+02:00; 7.8;4.0
Germany; 2019-04-22T02:00:00+02:00; 7.0;4.0
Germany; 2019-04-22T03:00:00+02:00; 6.0;4.0
USA; 2019-04-22T00:00:00+02:00; 5.1;3.0
USA; 2019-04-22T01:00:00+02:00; 4.1;3.0
USA; 2019-04-22T02:00:00+02:00; 3.6;3.0
USA; 2019-04-22T03:00:00+02:00; 3.1;3.0

I haven't really gotten anywhere, so I don't think my code helps much, but here is my failed attempt:

$xml = "C:\Users\[me]\Convert_XML_to_CSV\cars.xml"
$obj = New-Object System.XML.XMLDocument
$obj.Load("$xml")

foreach ($i in $_.Data.dealerships.location) {
    $o = New-Object Object
    Add-Member -InputObject $o -MemberType NoteProperty -Name location -Value $obj.Data.dealerships.Location $i $o
} | Export-Csv "result.csv" -Delimiter "," -NoType -Encoding UTF8
  • Which version of PowerShell are you targeting? Does it need to work in PowerShell 2.0 (Windows 7 default version)? Commented May 15, 2019 at 12:17
  • Hi Mathias, unfortunately yes, I am using Windows 7. I will upgrade in a couple of months, but for now it is just PowerShell 2.0. BR, BananaJoe Commented May 15, 2019 at 13:04

2 Answers

Perhaps not exactly what your desired output shows, but this may help.

Note: I'm using a here-string for the XML. In your case, load it from the file using

[xml]$xml = Get-Content "C:\Users\[me]\Convert_XML_to_CSV\cars.xml"
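
Get-Content decodes the file with PowerShell's default encoding and does not look at the encoding="ISO-8859-1" declaration, which can matter once names contain non-ASCII characters. A minimal sketch of an alternative, assuming that is a concern for your data: let System.Xml.XmlDocument read the file itself. The temp file and the location name below are made up for illustration.

```powershell
# Hypothetical sample file standing in for cars.xml (name and content are made up)
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'cars_sample.xml'
# 0xF6 is o-umlaut; built from a code point so this script stays ASCII-only
$name = 'K' + [char]0xF6 + 'ln'
$text = '<?xml version="1.0" encoding="ISO-8859-1"?>' +
        '<Data><dealerships><location name="' + $name + '"/></dealerships></Data>'
[System.IO.File]::WriteAllText($path, $text, [System.Text.Encoding]::GetEncoding('ISO-8859-1'))

# XmlDocument.Load reads the bytes itself and honours the encoding declaration
$xml = New-Object System.Xml.XmlDocument
$xml.Load($path)

# read the attribute back to confirm the non-ASCII character survived
$loaded = $xml.SelectSingleNode('/Data/dealerships/location').GetAttribute('name')
$loaded
```

The resulting $xml object can be used exactly like the one produced by the [xml] cast.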

The code:

[xml]$xml = @'
<?xml version="1.0" encoding="ISO-8859-1"?>
<Data source="Jhonny" datetime="2019-04-23T10:07:50+02:00" timezone="Europe">
    <dealerships>
        <location name="Germany">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="7.3"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="7.8"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="7.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="6.0"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="4.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="4.0"/>
            </series>
        </location>
        <location name="USA">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="5.1"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="4.1"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.6"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.1"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T02:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T03:00:00+02:00" value="3.0"/>
            </series>
        </location>
    </dealerships>
</Data>
'@ 

$result = foreach ($item in $xml.Data.dealerships.location) {
    $location = $item.Name

    # get the different column names
    $units = $item.series | ForEach-Object { '{0} {1}' -f $_.parameter, $_.unit}

    # loop through the series
    foreach ($series in $item.series) {
        # and the values
        foreach ($value in $series.value) {
            # since you are using PowerShell 2.0, create the output object like this
            $objOut = New-Object -TypeName PSObject
            $objOut | Add-Member -MemberType NoteProperty -Name 'Location' -Value $location
            $objOut | Add-Member -MemberType NoteProperty -Name 'DateTime' -Value $value.datetime

            $thisUnit = '{0} {1}' -f $series.parameter, $series.unit
            # add the different units as property.
            foreach ($unit in $units) { 
                $val = if ($unit -eq $thisUnit) { $value.value } else { '' }
                $objOut | Add-Member -MemberType NoteProperty -Name $unit -Value $val 
            }

            # output the object
            $objOut
        }
    }
}

# output on screen
$result | Format-Table -AutoSize
# output to CSV file
$result | Export-Csv -Path 'D:\test.csv' -Encoding UTF8 -NoTypeInformation

Result:

Location DateTime                  Sold Cars car Sold Cars Auto
-------- --------                  ------------- --------------
Germany  2019-04-22T00:00:00+02:00 7.3                         
Germany  2019-04-22T01:00:00+02:00 7.8                         
Germany  2019-04-22T02:00:00+02:00 7.0                         
Germany  2019-04-22T03:00:00+02:00 6.0                         
Germany  2019-04-22T00:00:00+02:00               4.0           
Germany  2019-04-22T01:00:00+02:00               4.0           
Germany  2019-04-22T02:00:00+02:00               4.0           
Germany  2019-04-22T03:00:00+02:00               4.0           
USA      2019-04-22T00:00:00+02:00 5.1                         
USA      2019-04-22T01:00:00+02:00 4.1                         
USA      2019-04-22T02:00:00+02:00 3.6                         
USA      2019-04-22T03:00:00+02:00 3.1                         
USA      2019-04-22T00:00:00+02:00               3.0           
USA      2019-04-22T01:00:00+02:00               3.0           
USA      2019-04-22T02:00:00+02:00               3.0           
USA      2019-04-22T03:00:00+02:00               3.0
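
The desired output in the question uses a semicolon as separator. Both Export-Csv and ConvertTo-Csv accept a -Delimiter parameter for that. The short sketch below uses two made-up rows (not the real $result) and ConvertTo-Csv so that the generated text is visible without writing a file.

```powershell
# Two made-up rows shaped like the $result objects above
$rows = foreach ($v in '7.3', '7.8') {
    $o = New-Object -TypeName PSObject
    $o | Add-Member -MemberType NoteProperty -Name 'Location' -Value 'Germany'
    $o | Add-Member -MemberType NoteProperty -Name 'Sold Cars car' -Value $v
    $o
}

# ConvertTo-Csv produces the same text that Export-Csv would write to the file
$lines = @($rows | ConvertTo-Csv -NoTypeInformation -Delimiter ';')
$lines
# first line: "Location";"Sold Cars car"
```

For the real data the same switch goes on the export line: $result | Export-Csv -Path 'D:\test.csv' -Delimiter ';' -Encoding UTF8 -NoTypeInformation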


Update

As requested in your comment, you can further combine/group the $result array from the code above like this:

$combined = $result | Group-Object -Property DateTime, Location | ForEach-Object {
    foreach ($location in ($_.Group | Group-Object Location)) {
        # create an output object and put in the Location property here
        $objOut = New-Object -TypeName PSObject
        $objOut | Add-Member -MemberType NoteProperty -Name 'Location' -Value ($location.Name)
        foreach ($date in ($location.Group | Group-Object DateTime)) {
            # add the DateTime property
            $objOut | Add-Member -MemberType NoteProperty -Name 'DateTime' -Value ($date.Name)
            foreach ($unit in $_.Group) {
                # join the other two properties to the $objOut object:
                # I do not want to hard-code the property names here, 
                # so use Select-Object to get the remaining props.
                $sold = $unit | Select-Object * -ExcludeProperty Location, DateTime
                foreach ($thing in $sold.psobject.properties | Where-Object { ($_.Value) }) {
                    # if you want the numbers as floating-point numbers, do this:
                    # $objOut | Add-Member -MemberType NoteProperty -Name $($thing.Name) -Value ([double]$thing.Value)
                    # like below, these values will be output as string
                    $objOut | Add-Member -MemberType NoteProperty -Name $($thing.Name) -Value ($thing.Value)
                }
            }
        }
        $objOut
    }
}

# output on screen
$combined | Format-Table -AutoSize
# output to CSV file
$combined | Export-Csv -Path 'D:\test_Grouped.csv' -Encoding UTF8 -NoTypeInformation

This will result in:

Location DateTime                  Sold Cars car Sold Cars Auto
-------- --------                  ------------- --------------
Germany  2019-04-22T00:00:00+02:00 7.3           4.0           
Germany  2019-04-22T01:00:00+02:00 7.8           4.0           
Germany  2019-04-22T02:00:00+02:00 7.0           4.0           
Germany  2019-04-22T03:00:00+02:00 6.0           4.0           
USA      2019-04-22T00:00:00+02:00 5.1           3.0           
USA      2019-04-22T01:00:00+02:00 4.1           3.0           
USA      2019-04-22T02:00:00+02:00 3.6           3.0           
USA      2019-04-22T03:00:00+02:00 3.1           3.0
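
As an alternative to grouping $result afterwards, the rows can also be merged while walking the XML. The following is only a sketch, not a drop-in replacement: it keeps one object per location/datetime pair in a hashtable and fills in each series column as it is encountered. The shortened XML, the $rows and $order names and the "|" key separator are assumptions made for illustration. It avoids [ordered] and [pscustomobject], so it should also run on PowerShell 2.0.

```powershell
[xml]$xml = @'
<Data>
    <dealerships>
        <location name="Germany">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="7.3"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="7.8"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="4.0"/>
            </series>
        </location>
    </dealerships>
</Data>
'@

# collect every column name first so that all rows get the same properties
$columns = @($xml.SelectNodes('//series') |
    ForEach-Object { '{0} {1}' -f $_.GetAttribute('parameter'), $_.GetAttribute('unit') } |
    Select-Object -Unique)

$rows  = @{}   # "location|datetime" -> output object
$order = @()   # keys in first-seen order, because a hashtable is unordered

foreach ($location in $xml.SelectNodes('/Data/dealerships/location')) {
    $name = $location.GetAttribute('name')
    foreach ($series in $location.SelectNodes('series')) {
        $column = '{0} {1}' -f $series.GetAttribute('parameter'), $series.GetAttribute('unit')
        foreach ($value in $series.SelectNodes('value')) {
            $datetime = $value.GetAttribute('datetime')
            $key = '{0}|{1}' -f $name, $datetime
            if (-not $rows.ContainsKey($key)) {
                # first time this location/datetime pair is seen: create an empty row
                $row = New-Object -TypeName PSObject
                $row | Add-Member -MemberType NoteProperty -Name 'Location' -Value $name
                $row | Add-Member -MemberType NoteProperty -Name 'DateTime' -Value $datetime
                foreach ($c in $columns) {
                    $row | Add-Member -MemberType NoteProperty -Name $c -Value ''
                }
                $rows[$key] = $row
                $order += $key
            }
            # fill in the column that belongs to the current series
            $rows[$key].$column = $value.GetAttribute('value')
        }
    }
}

$pivot = @($order | ForEach-Object { $rows[$_] })
$pivot | Format-Table -AutoSize
```

A datetime that occurs in only one series still yields a row. The other column simply stays empty, as in the second row of this sample.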

5 Comments

This is a neat approach: $units = $item.series | ForEach-Object { '{0} {1}' -f $_.parameter, $_.unit}
Hi Theo, thank you so much! I can recreate your result. I believe this should do the job. Have a good one!
Theo, is there a way to put the Cars and Autos on the same line, so that Location and DateTime are not duplicated?
@BananaJoe I have updated my answer to combine the results into a more condensed output. Hope you like it.
Works like a charm. Thank you @Theo, that is amazing! I am trying to understand the solution :)

This one was a bit tricky. I dealt with it by parsing the XML using PowerShell's native parsing capabilities, then stepping through the nodes by .location, which gives us a list broken up by location (so we'd have one entry for USA, one for Germany, etc.).

Within the first loop, we have two series for each location: one with a unit of car and one with a unit of Auto. So next we find the series with a unit of car to get all of the cars sold, and then we foreach our way through its values.

Within the most deeply nested loop, over the car values, we find the matching record from the Auto series by comparing the datetime.

This gives us all of the properties we need to create a custom object the PowerShell 2.0 way. I tested it, and the output looks to be right in line with what you were looking for.

# $x is assumed to hold the XML text, e.g. the content of cars.xml
$dealerships = ([xml]$x).Data.dealerships.location

foreach ($location in $dealerships){
    $cars = $location.series | Where-Object {$_.unit -eq 'car'}
    foreach ($car in $cars.value){
        $auto = $location.series | Where-Object {$_.unit -eq 'Auto'} | Select-Object -ExpandProperty value | Where-Object {$_.datetime -eq $car.datetime}

        $ObjectProperties = @{
            Location = $location.name
            DateTime = $car.datetime
            SoldCars = $car.value
            SoldAutos= $auto.value
        }
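        # a hashtable is unordered, so fix the column order explicitly
        New-Object PSObject -Property $ObjectProperties | Select-Object Location, DateTime, SoldCars, SoldAutos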
    }
}
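
Because the loop above is driven by the car series, a datetime that exists only in the Auto series never produces a row. One possible way around that, sketched under the assumption that a missing value should show up as an empty field, is to loop over the distinct datetimes of all series in a location instead. The shortened XML and the Get-SeriesValue helper are hypothetical and only serve the illustration.

```powershell
[xml]$xml = @'
<Data>
    <dealerships>
        <location name="USA">
            <series parameter="Sold Cars" unit="car">
                <value datetime="2019-04-22T00:00:00+02:00" value="5.1"/>
            </series>
            <series parameter="Sold Cars" unit="Auto">
                <value datetime="2019-04-22T00:00:00+02:00" value="3.0"/>
                <value datetime="2019-04-22T01:00:00+02:00" value="3.0"/>
            </series>
        </location>
    </dealerships>
</Data>
'@

# hypothetical helper: value of one unit at one datetime, or '' if there is none
function Get-SeriesValue($location, $unit, $datetime) {
    $xpath = "series[@unit='$unit']/value[@datetime='$datetime']"
    $node = $location.SelectSingleNode($xpath)
    if ($node -ne $null) { $node.GetAttribute('value') } else { '' }
}

$result = foreach ($location in $xml.SelectNodes('/Data/dealerships/location')) {
    # every datetime that occurs in any series of this location
    $datetimes = $location.SelectNodes('series/value') |
        ForEach-Object { $_.GetAttribute('datetime') } | Sort-Object -Unique

    foreach ($datetime in $datetimes) {
        $row = New-Object -TypeName PSObject
        $row | Add-Member -MemberType NoteProperty -Name 'Location'  -Value $location.GetAttribute('name')
        $row | Add-Member -MemberType NoteProperty -Name 'DateTime'  -Value $datetime
        $row | Add-Member -MemberType NoteProperty -Name 'SoldCars'  -Value (Get-SeriesValue $location 'car' $datetime)
        $row | Add-Member -MemberType NoteProperty -Name 'SoldAutos' -Value (Get-SeriesValue $location 'Auto' $datetime)
        $row
    }
}

$result | Format-Table -AutoSize
```

In this sample the 01:00 row has an Auto value but no car value, and it still appears with an empty SoldCars field.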

3 Comments

Hey, just saw your solution now. Will try this tomorrow as well. Thank you for your help in any case. From a first glance this seems to give me what I was looking for, so keen on trying it tomorrow. Good day!
Hi FoxDeploy, I have tested your solution now and found one problem with it, which you couldn't have encountered with my test data. You match the Autos and Cars via the datetime. This means that if there is an Auto value for which no car value exists, that line never appears in the output. Can I circumvent this?
FoxDeploy, I will pursue Theo's solution. But many thanks for your approach! Something I can learn from :)
