9

I want to fill a dynamic array with the same integer value as fast as possible using PowerShell.
Measure-Command shows that it takes 7 seconds on my system to fill it.
My current code (snipped) looks like this:

$myArray = @()
$length = 16385
for ($i=1;$i -le $length; $i++) {$myArray += 2}  

(Full code can be seen on gist.github.com or on superuser)

Note that $length can change; I chose a fixed length here only to make the example easier to follow.

Q: How do I speed up this PowerShell code?

5 Answers

24

You can repeat arrays, just as you can do with strings:

$myArray = ,2 * $length

This means »Take the array with the single element 2 and repeat it $length times, yielding a new array.«
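
As a small illustration of the repetition operator, here is a sketch with an arbitrary length of 5, chosen only to keep the output short ($repeatedString is a name made up for this example):

```powershell
# Arbitrary small length for illustration; the question uses 16385.
$length = 5

# Strings can be repeated with *.
$repeatedString = 'ab' * 3      # 'ababab'

# The unary comma wraps 2 in a one-element array, which * then repeats.
$myArray = ,2 * $length

$myArray.Count                  # 5
$myArray -join ' '              # 2 2 2 2 2
$myArray.GetType().Name         # Object[]
```

The result is an untyped Object[]; if a typed array is needed, a cast such as [int[]]$myArray = ,2 * $length should work as well.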

Note that you cannot really use this to create multidimensional arrays because the following:

$some2darray = ,(,2 * 1000) * 1000

will just create 1000 references to the same inner array, so a change made through one row shows up in all of them, which makes the result useless for manipulation. In that case you can use a hybrid strategy. I have used

$some2darray = 1..1000 | ForEach-Object { ,(,2 * 1000) }

in the past, but the performance measurements below suggest that

$some2darray = foreach ($i in 1..1000) { ,(,2 * 1000) }

would be a much faster way.
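
The difference between the shared and the independent inner arrays can be seen in a small sketch. The size of 3 and the variable names $shared and $independent are assumptions made only for this illustration:

```powershell
# Small dimension chosen only for illustration.
$size = 3

# Repeating the outer array copies references: every row is the same inner array.
$shared = ,(,2 * $size) * $size
$shared[0][0] = 99
$shared[1][0]                   # 99, because row 1 is the same object as row 0

# Building each row inside a loop creates a separate inner array per iteration.
$independent = foreach ($i in 1..$size) { ,(,2 * $size) }
$independent[0][0] = 99
$independent[1][0]              # 2, the other rows are unaffected
```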


Some performance measurements:

Command                                                  Average Time (ms)
-------                                                  -----------------
$a = ,2 * $length                                                 0,135902 # my own
[int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)           7,15362 # JPBlanc
$a = foreach ($i in 1..$length) { 2 }                             14,54417
[int[]]$a = -split "2 " * $length                                24,867394
$a = for ($i = 0; $i -lt $length; $i++) { 2 }                    45,771122 # Ansgar
$a = 1..$length | %{ 2 }                                         431,70304 # JPBlanc
$a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }       10425,79214 # original code

Taken by running each variant 50 times through Measure-Command, each with the same value for $length, and averaging the results.

Positions 3 and 5 are a bit of a surprise, actually. Apparently it is much faster to foreach over a range than to use a normal for loop.


Code to generate the table above:

$length = 16384

$tests = '$a = ,2 * $length',
         '[int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)',
         '$a = for ($i = 0; $i -lt $length; $i++) { 2 }',
         '$a = foreach ($i in 1..$length) { 2 }',
         '$a = 1..$length | %{ 2 }',
         '$a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }',
         '[int[]]$a = -split "2 " * $length'

$tests | ForEach-Object {
    $cmd = $_
    $timings = 1..50 | ForEach-Object {
        Remove-Variable i,a -ErrorAction Ignore
        [GC]::Collect()
        Measure-Command { Invoke-Expression $cmd }
    }
    [pscustomobject]@{
        Command = $cmd
        'Average Time (ms)' = ($timings | Measure-Object -Average TotalMilliseconds).Average
    }
} | Sort-Object Ave* | Format-Table -AutoSize -Wrap

1 Comment

+1 Concise, clear, constructive, comprehensive, and reproducible! (well, 4 out of 5 c's...)
8

Avoid appending to an array in a loop. Each += copies the existing array into a new, larger array, so the cost grows with every iteration. Do this instead:

$MyArray = for ($i=1; $i -le $length; $i++) { 2 }
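
A minimal sketch of why += is expensive, next to the loop-output approach. The names $original and $grown are made up for this illustration:

```powershell
$original = @()
$grown = $original
$grown += 2     # allocates a new array and copies the old elements into it

# The two variables no longer refer to the same array object.
[object]::ReferenceEquals($original, $grown)   # False
$original.Count                                # 0
$grown.Count                                   # 1

# Collecting the loop output avoids the repeated copying.
$length = 16385
$MyArray = for ($i = 1; $i -le $length; $i++) { 2 }
$MyArray.Count                                 # 16385
```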

1 Comment

+1 $MyArray = for ($i=1; $i -le 16385; $i++) { 2 } runs in 0,05 seconds. much faster than my 7s :)
5

With PowerShell 3.0 you can use the following (requires .NET Framework 3.5 or later):

[int[]]$MyArray = ([System.Linq.Enumerable]::Repeat(2, 65000))

With PowerShell 2.0:

$AnArray = 1..65000 | % {2}
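
For the question's variable length, the Repeat variant might look like the following sketch (assuming PowerShell 3.0 or later; the length is taken from the question):

```powershell
$length = 16385

# Enumerable.Repeat returns a lazy sequence; the [int[]] cast
# enumerates it into a typed integer array.
[int[]]$MyArray = [System.Linq.Enumerable]::Repeat(2, $length)

$MyArray.GetType().Name   # Int32[]
$MyArray.Length           # 16385
```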

1 Comment

+1 [int[]]$myArray = ([System.Linq.Enumerable]::Repeat(2, 16385)) runs in 0,03s
1

It is not clear what you are trying to do. I looked at your code, but $myArray += 2 just appends 2 as a new element on each iteration. For example, here is the output from my test code:

$myArray = @()
$length = 4
for ($i=1;$i -le $length; $i++) {
    Write-Host $myArray
    $myArray += 2
}

2
2 2
2 2 2

Why do you need to add 2 as the array element so many times?

If all you want is to fill the array with the same value, try this:

$myArray = 1..$length | % { 2 }

6 Comments

He is just filling the array with some value; the value is '2'.
Question says he wants to fill the array with the same integer value. His problem is that appending to the array with += is terribly slow.
Hmm! I understood that. But why? Why even find a better way to do something that is not needed? Anyway, he can use the range operator as well.
I already added the full code as a GitHub link just to avoid discussions about why. If you look at the link, you will see that my PowerShell script executes an Excel command for querying a CSV. The parameter TextFileColumnDataTypes for that query needs an array to know what data types the columns should be. A 2 stands for a string column, 1 for general, 9 to skip the entire column and so on. Long story short: I need a big array with the integer value 2.
+1 $myArray = 1..16385 | % { 2 } runs in 0,02 seconds. much faster than my 7s :)
-1

If you need it really fast, then go with ArrayLists and Tuples:

$myArray = New-Object 'Collections.ArrayList'
# Note: the assignment below replaces the ArrayList with the collected loop output
$myArray = foreach($i in 1..$length) {
    [tuple]::create(2)
}

And if you need to sort it later, then use this (normally a bit slower):

$myArray = New-Object 'Collections.ArrayList'
foreach($i in 1..$length) {
    # ArrayList.Add returns the new index; [void] keeps it out of the output
    [void]$myArray.Add(
        [tuple]::create(2)
    )
}

Both versions are in the 20 ms range for me ;-)

3 Comments

Though this is faster than the code in the question, what is the purpose of using a 1-value Tuple? That means you have to access the Item1 property to get the value back, plus you're creating an Object to wrap every Int32, which will be a lot of garbage on larger lists. This doesn't hurt so badly because the obsolete, non-generic ArrayList class would box each Int32 in an Object anyway. Rewriting as $myArray = New-Object 'Collections.Generic.List[Int32]'; foreach($i in 1..$length) { $myArray.add(2) } I get a 40% speedup, with fewer characters and less complexity, too.
Also, each property of a Tuple is read-only, so if you want to change a list value (which is bound to happen because...what good is a list of repeated values that always stay the same?) your only option is to create a new Tuple to replace it.
Even if the tuple part is not really needed for the above challenge, it is worth remembering this for filling large read-only arrays with multiple columns/items per object. Very handy for sorting very large arrays/lookup tables by different columns. Without tuples and the need to sort anything.
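
The generic-list rewrite suggested in the first comment can be sketched as follows. The variable names $list and $asArray are made up for this illustration, and the length is taken from the question:

```powershell
$length = 16385

# A generic list of Int32 stores the integers directly, without boxing.
$list = New-Object 'Collections.Generic.List[Int32]'
foreach ($i in 1..$length) {
    $list.Add(2)          # List[T].Add returns nothing, so no output is produced
}

$list.Count               # 16385

# Unlike tuple properties, list elements can be changed in place.
$list[0] = 1

# Convert to a plain integer array if an API expects one.
[int[]]$asArray = $list.ToArray()
```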
