2

I am looking to make my script a lot faster and giving me an output faster. I am dealing with large sets of data in my csv, and it takes around an hour if I put in the full file to complete with the script.

 $csv = Import-Csv 'U:\Local Group Members.csv' |
Where-Object {($_.Name0 -eq "administrators") -and ($_.caption0 -match "Microsoft Windows 10 Enterprise|Microsoft Windows 7 Enterprise|Microsoft Windows 7 Professional|Microsoft Windows 8 Enterprise|Microsoft Windows 8 Pro|Microsoft Windows 8.1 Enterprise|Microsoft Windows 8.1 Pro")} |
  Select-Object "Netbios_name0", "Name0", "Account0","category0","Domain0","Unique Account Name","Type0","caption0", "Excluded"
#Modify each line
Foreach ($row in $csv) {
If ($row.Type0 -eq 'Domain') { 
    $row."Unique Account Name" = "$($row.Domain0) - $($row.Account0)" 
    If ($row."Unique Account Name" -in @('ACCOUNTS - DODSCAN'.'ACCOUNTS - Domain Admins','ACCOUNTS - LADM_WS_Admins','ACCOUNTS - Tech Enterprise'))
         {$row."Excluded" = "True"}
    Else {$row."Excluded" = "False"}        
}
Else {
    $row."Unique Account Name" = "$($row.Netbios_name0) - $($row.Account0)"
    If ($row."Account0" -in @('esrxadm1n_esi','#Update','medco_tech','medco_admin'))
         {$row."Excluded" = "True"}
    Else {$row."Excluded" = "False"}
}
    Write-Host $row."Unique Account Name"
    Write-Host $row."Excluded"    
}

#Export CSV
 $csv | Export-Csv U:\$(get-date -f yyyy-MM-dd-hh-mm)-TestOutput1.csv -NoTypeInformation 

Can anyone help me make it faster?

10
  • 2
    This would be a better post on CodeReview rather than StackOverflow. Commented Jun 15, 2017 at 15:31
  • Oh okay, thank you @BenH Commented Jun 15, 2017 at 15:32
  • How many rows does this file have? Commented Jun 15, 2017 at 15:32
  • 1
    That isn't all that large. Perhaps the Write-Host commands could write to a file instead. In any event, posting the question on Code Review rather than here does sound like a good suggestion. I don't know enough PowerShell to answer your question. Commented Jun 15, 2017 at 15:38
  • 1
    Even 330k isn't all that much. Nothing that explains 60 minutes of run-time for code which seems to be O(n) in the number of rows. Printing to a console is slow. If the bottleneck isn't all of those Write-Host commands then I have no idea what it is. Hopefully someone at Code Review (where working code is improved) would know. Commented Jun 15, 2017 at 15:46

4 Answers 4

1

Examples of what I was talking about in comments:

$t = New-Object System.Timers.Timer
$t.Start()

$arr1 = @('ACCOUNTS - DODSCAN'.'ACCOUNTS - Domain Admins','ACCOUNTS - LADM_WS_Admins','ACCOUNTS - Tech Enterprise')
$arr2 = @('esrxadm1n_esi','#Update','medco_tech','medco_admin')

$csv = Import-Csv 'U:\Local Group Members.csv' | Where-Object {($_.Name0 -eq "administrators") -and ($_.caption0 -match "Microsoft Windows 10 Enterprise|Microsoft Windows 7 Enterprise|Microsoft Windows 7 Professional|Microsoft Windows 8 Enterprise|Microsoft Windows 8 Pro|Microsoft Windows 8.1 Enterprise|Microsoft Windows 8.1 Pro")} | Select-Object "Netbios_name0", "Name0", "Account0","category0","Domain0","Unique Account Name","Type0","caption0", "Excluded"
Write-Host "Took $($t.Elapsed)ms to load and filter the CSV"

#Modify each line
Foreach ($row in $csv) {
If ($row.Type0 -eq 'Domain') { 
    $row."Unique Account Name" = "$($row.Domain0) - $($row.Account0)" 
    If ($row."Unique Account Name" -in $arr1)
         {$row."Excluded" = "True"}
    Else {$row."Excluded" = "False"}        
}
Else {
    $row."Unique Account Name" = "$($row.Netbios_name0) - $($row.Account0)"
    If ($row."Account0" -in $arr2)
         {$row."Excluded" = "True"}
    Else {$row."Excluded" = "False"}
}
    Write-Host $row."Unique Account Name"
    Write-Host $row."Excluded"    
}

#Export CSV
 $csv | Export-Csv U:\$(get-date -f yyyy-MM-dd-hh-mm)-TestOutput1.csv -NoTypeInformation
Sign up to request clarification or add additional context in comments.

4 Comments

I tried yours and when I ran it, it still took 54 minutes for the whole file. The one I did takes 49
What was the emitted time for the file load?
As in from creation of the file to the end of the script?
As in, I showed you how to create a timer object so you could see how you might be able to time out the different sections of your script. The example I gave was to emit the time it took immediately after you load and filter the CSV. My suspicion is that this will be the majority of the time. You can see it in the "Write-Host" immediately following your $csv assignment.
1

And if you put all in the same line like this?

$tab1 = @('ACCOUNTS - DODSCAN', 'ACCOUNTS - Domain Admins','ACCOUNTS - LADM_WS_Admins','ACCOUNTS - Tech Enterprise')
$tab2 = @('esrxadm1n_esi','#Update','medco_tech','medco_admin')

Import-Csv 'U:\Local Group Members.csv' | Select "administrators", "Netbios_name0", "Name0", "Account0","category0","Domain0","Unique Account Name","Type0","caption0", "Excluded" | %{

 if ($_.Name0 -eq "administrators" -and $_.caption0 -match "Microsoft Windows 10 Enterprise|Microsoft Windows 7 Enterprise|Microsoft Windows 7 Professional|Microsoft Windows 8 Enterprise|Microsoft Windows 8 Pro|Microsoft Windows 8.1 Enterprise|Microsoft Windows 8.1 Pro")
 {
    If ($_.Type0 -eq 'Domain') 
    { 
        $_."Unique Account Name" = "$($_.Domain0) - $($_.Account0)"
        $_."Excluded" = $_."Unique Account Name" -in $tab1   
    }
    Else 
    {
        $_."Unique Account Name" = "$($_.Netbios_name0) - $($_.Account0)"
        $_."Excluded" = $_."Account0" -in $tab2
    }

  $_

}

} | Export-Csv U:\$(get-date -f yyyy-MM-dd-hh-mm)-TestOutput1.csv -NoTypeInformation 

21 Comments

Unfortunately yours did not run when I put it in the powershell.
I added in the Unique Account Name, so it is giving me the following error because it is saying it is not in there.Exception setting "Unique Account Name": "The property 'Unique Account Name' cannot be found on this object. Verify that the property exists and can be set." At line:10 char:55 + $_."Unique Account Name" = "$($_.Domain0) - $($_.Account0)" + ~~~~~~~~~~~ + CategoryInfo : NotSpecified: (:) [], SetValueInvocationException
Ok you have created this property into your select, i dont have seen that. Try my modified code please
I just tried it and it ran but it did not produce out any output into my CSV. It stayed at 0 kb
Sorry, try again
|
0

Instead of having 2 Foreach blocks looping on the same object ($row in $csv), why don't you just combine them? If your CSV is large, why go through it twice?

Foreach ($row in $csv) {
If ($row.Type0 -eq 'Domain') { 
    $row."Unique Account Name" = "$($row.Domain0) - $($row.Account0)" 
    Write-Host $row."Unique Account Name"
} Else { 
    $row."Unique Account Name" = "$($row.Netbios_name0) - $($row.Account0)"
 }
If (($row.Type0 -eq 'Local') -and ($row.Account0 -in @('esrxadm1n_esi','#Update','medco_tech','medco_admin')) -and ($row."Unique Account Name" -in @('ACCOUNTS - DODSCAN'.'ACCOUNTS - Domain Admins','ACCOUNTS - LADM_WS_Admins','ACCOUNTS - Tech Enterprise')))  { 
    $row."Excluded" = "True" 
    Write-Host $row."Excluded"
 } Else { 
    $row."Excluded" = "False" 
 }
}

Also, where you do this match at the top:

($_.caption0 -match "Microsoft Windows 10 Enterprise|Microsoft Windows 7 Enterprise|Microsoft Windows 7 Professional|Microsoft Windows 8 Enterprise|Microsoft Windows 8 Pro|Microsoft Windows 8.1 Enterprise|Microsoft Windows 8.1 Pro") 

Would you be able to reduce it to $_.caption0 -match "Microsoft Windows \d{1,2}" without it picking up rows you don't want, or is that out of the picture?

1 Comment

That is kind of out of the picture since we do not want the Microsoft Windows Servers in there
0

I have a ~600k line CSV of event logs here, and I can test:

import-csv events.csv | select * | foreach { $_ } | Export-csv out.csv

and it takes 5 minutes 15 seconds (~180Mb / on an SSD). With a write-host it pushes up to 9 mins 36 seconds. So it's maybe something about your CSV being HUGE (lots of info per row?), loading it all into memory at once? or the write-host lines.

Try changing it to a streaming approach:

Import-Csv 'U:\Local Group Members.csv' |
    Where-Object {

        ($_.Name0 -eq "administrators") -and
        ($_.caption0 -match "Microsoft Windows (10 Enterprise|7 Enterprise|7 Professional|8 Enterprise|8 Pro|8.1 Enterprise|8.1 Pro)")

    } | ForEach-Object { 

        If ($row.Type0 -eq 'Domain')
        {
            $row."Unique Account Name" = $row.Domain0 + ' - ' + $row.Account0
            $row."Excluded" = ($row."Unique Account Name" -in @('ACCOUNTS - DODSCAN'.'ACCOUNTS - Domain Admins','ACCOUNTS - LADM_WS_Admins','ACCOUNTS - Tech Enterprise'))
        }
        Else
        {
            $row."Unique Account Name" = $row.Netbios_name0 + ' - ' + $row.Account0
            $row."Excluded" = ($row."Account0" -in @('esrxadm1n_esi','#Update','medco_tech','medco_admin'))
        }

    } | Select-Object "Netbios_name0", "Name0", "Account0",
                      "category0","Domain0","Unique Account Name",
                      "Type0","caption0", "Excluded" |
    Export-Csv U:\$(get-date -f yyyy-MM-dd-hh-mm)-TestOutput1.csv -NoTypeInformation

I also shuffled around some of the redundant "if (thing test is true) then (true) else if (thing test is false) then (false)" into just "(thing test)", removed subexpressions, shortened the regex a bit, and moved the select to the end on the guideline 'filter early, select late'.

But making it significantly faster than a PS approach is going to involve dropping the CSV processing from PowerShell down to .Net, or treating it as a text file and doing some filtering that way first, or something.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.