
I have a performance issue with the code below. I want to extract some information from a JSON file into a CSV. The JSON file has around 200k lines, and the conversion takes over an hour to process it.

I think the problem might be the Add-Content cmdlet, since I'm writing to a normal HDD. Could you let me know if you see any improvements or changes I could make to the code?

$file = "$disk\TEMP\" + $mask
$res = (Get-Content $file) | ConvertFrom-Json
$file = "$disk\TEMP\result.csv"

Write-Host "Creating CSV from JSON" -ForegroundColor Green
Add-Content $file ("{0},{1},{2},{3},{4}" -f "TargetId", "EventType", "UserId", "Username", "TimeStamp")

$l = 0
foreach ($line in $res) {
    if($line.EventType -eq 'DirectDownloadCompleted' -and $line.TargetDefinition -eq 'GOrder') { 
        #nothing here
    } elseif($line.EventType -eq 'DirectDownloadCompleted' -and $line.TargetDefinition -eq 'GFile') {
        Add-Content $file ("{0},{1},{2},{3},{4}" -f
        $line.AssetId, $line.EventType, $line.UserId, $line.UserName, $line.TimeStamp)
        $l = $l + 1
    } else {
        Add-Content $file ("{0},{1},{2},{3},{4}" -f $line.TargetId, $line.EventType, $line.UserId, $line.UserName, $line.TimeStamp)
        $l = $l + 1
    }
}
  • If you want significantly better performance, I'd suggest using jq, which is easily installed on Windows. Processing a 200,000-line file with jq for the kind of task you describe shouldn't take more than 1s. The jq download page is stedolan.github.io/jq/download; if you have chocolatey, the following should be sufficient: choco install jq Commented Nov 18, 2015 at 22:38

1 Answer


OK, a few lessons here, I think. First off, don't rewrite the Export-Csv cmdlet. Instead, convert your data into an array of objects and output it all at once. That way you write to the file only once instead of once per row, which should increase your speed dramatically. Also, don't do ForEach > If > ElseIf > Else when the Switch statement already does this: given a collection, it loops over each element and tests it against each condition. Try something like this:

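# Build every output row in memory first; Switch loops over each element of $res.
# Wrapping it in @() keeps $Results an array even when only one row matches.
$Results = @(Switch ($res) {
    # Skip GOrder downloads entirely
    {$_.EventType -eq 'DirectDownloadCompleted' -and $_.TargetDefinition -eq 'GOrder'} {Continue}
    # GFile downloads: report AssetId under the TargetId column
    {$_.EventType -eq 'DirectDownloadCompleted' -and $_.TargetDefinition -eq 'GFile'} {$_ | Select-Object @{Name='TargetId';Expression={$_.AssetId}},EventType,UserId,Username,TimeStamp; Continue}
    Default {$_ | Select-Object TargetId,EventType,UserId,Username,TimeStamp}
})
# Write the file once instead of once per row
$Results | Export-Csv $file -NoTypeInformation
$l = $Results.Count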
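
If you'd rather keep the shape of your original loop, the same "collect objects, write once" idea can be sketched with a plain foreach. This is only a minimal sketch: the inline JSON sample and the result-sketch.csv path in the temp folder are made up for illustration, and only the property names come from your question.

```powershell
# Hypothetical sample standing in for the real JSON file from the question.
$json = @'
[
  {"EventType":"DirectDownloadCompleted","TargetDefinition":"GOrder","TargetId":"t1","AssetId":"a1","UserId":1,"UserName":"alice","TimeStamp":"2015-11-18 10:00"},
  {"EventType":"DirectDownloadCompleted","TargetDefinition":"GFile","TargetId":"t2","AssetId":"a2","UserId":2,"UserName":"bob","TimeStamp":"2015-11-18 10:05"},
  {"EventType":"Login","TargetDefinition":"GUser","TargetId":"t3","AssetId":"a3","UserId":3,"UserName":"carol","TimeStamp":"2015-11-18 10:10"}
]
'@
$res = $json | ConvertFrom-Json

# Collect one object per output row; nothing touches the disk inside the loop.
$rows = foreach ($line in $res) {
    $isDownload = $line.EventType -eq 'DirectDownloadCompleted'

    # GOrder downloads are skipped, as in the question.
    if ($isDownload -and $line.TargetDefinition -eq 'GOrder') { continue }

    # GFile downloads report AssetId in the TargetId column.
    $isFile = $isDownload -and $line.TargetDefinition -eq 'GFile'

    [pscustomobject]@{
        TargetId  = if ($isFile) { $line.AssetId } else { $line.TargetId }
        EventType = $line.EventType
        UserId    = $line.UserId
        Username  = $line.UserName
        TimeStamp = $line.TimeStamp
    }
}

# Hypothetical output path; a single write for the whole file.
$out = Join-Path ([System.IO.Path]::GetTempPath()) 'result-sketch.csv'
$rows | Export-Csv -Path $out -NoTypeInformation

# @() keeps the count correct even if only one row was produced.
$count = @($rows).Count
```

The point is the same as with Switch: Add-Content is called zero times, and Export-Csv writes everything in one go.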

1 Comment

Perfect, many thanks. This script reduced the time to 10 min. I went even further, following the suggestion from @sodawillow, and created a separate C# console app that converts the JSON to CSV in 10 sec.
