
I am very new to PowerShell. I have a script a co-worker helped me build, and it works on a small set of data. However, I am sending the data to a SAP BusinessObjects query, which will only accept about 2000 items at a time. The amount of data varies each month but is usually around 7000-8000 items. I need help updating my script so it runs through the list of data, creates an array, adds 2000 items to it, then creates a new array with the next 2000 items, and so on until it reaches the end of the list.

$source = "{0}\{1}" -f $ENV:UserProfile, "Documents\Test\DataSD.xls"

$Excel = New-Object -ComObject Excel.Application
$WorkbookSource = $Excel.Workbooks.Open($source)
$WorkSheetSource = $WorkbookSource.WorkSheets.Item(1)
$WorkSheetSource.Activate()
$row = 2
$docArray = @()

do {
    $docArray += $WorkSheetSource.Cells.Item($row, 1).Value()
    $row++
} while ($WorkSheetSource.Cells.Item($row, 1).Value() -ne $null)

So for this example I would need the script to create four separate arrays: the first three with 2000 items each, and the last with 1200 items.

3 Answers


For this to work, you will need to export the data to a CSV file or otherwise extract it into a collection that holds all the items. Something like .NET's StreamReader would probably allow for faster processing, but I have never worked with it. [blush]

Once each $CurBatch is generated, you can feed it into whatever process you want.

$InboundCollection = 1..100
$ProcessLimit = 22
# the "- 1" is to correct for "starts at zero"
$ProcessLimit = $ProcessLimit - 1

$BatchCount = [math]::Floor($InboundCollection.Count / $ProcessLimit)

foreach ($BC_Item in 0..$BatchCount) {
    if ($BC_Item -eq 0) {
        $Start = 0
    }
    else {
        $Start = $End + 1
    }

    $End = $Start + $ProcessLimit
    # PowerShell will happily slice past the end of an array
    $CurBatch = $InboundCollection[$Start..$End]

    ''
    $Start
    $End
    # the 1st item is not the _number in $Start_
    #    it's the number in the array @ "[$Start]"
    "$CurBatch"
}

output ...

0
21
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

22
43
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44

44
65
45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66

66
87
67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88

88
109
89 90 91 92 93 94 95 96 97 98 99 100
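The slicing approach above can also be wrapped in a small reusable helper. This is just a sketch; the function name `Split-Batch` is made up, and clamping the end index avoids carrying `$End` between iterations:

```powershell
function Split-Batch {
    param(
        [object[]] $InputArray,
        [int] $BatchSize
    )
    for ($start = 0; $start -lt $InputArray.Count; $start += $BatchSize) {
        # clamp the end index so the last batch holds only the leftover items
        $end = [Math]::Min($start + $BatchSize, $InputArray.Count) - 1
        # the unary comma emits each batch as one array instead of unrolling it
        ,($InputArray[$start .. $end])
    }
}

$batches = @(Split-Batch -InputArray (1..100) -BatchSize 22)
$batches.Count      # 5 batches
$batches[4].Count   # 12 items left over in the last batch
```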



To do this, there are a number of options. You can read everything from the Excel file into one large array and split it into smaller chunks afterwards, or you can add the Excel values to separate arrays while reading.
The code below does the latter.

In any case, it is up to you when you would like to actually send the data:

  1. process each array immediately (send it to a SAP business objects query) while reading from Excel
  2. add it to a Hashtable so you keep all arrays together in memory
  3. store it on disk for later use

In the code below, I chose the second option: read the data into a number of arrays and keep these in memory in a hashtable.
The advantage is that you do not need to interrupt the reading of the Excel data as with option 1., and there is no need to create and re-read 'in-between' files as with option 3.
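For comparison, option 1. (sending each batch as soon as it is full, while reading) can be sketched like this. `Send-Batch` is a made-up placeholder for the real SAP BusinessObjects call, and the range `1..7200` stands in for the values read cell by cell from the worksheet:

```powershell
# Option 1. sketch: flush each batch as soon as it reaches $maxArraySize.
$script:sent = @()
function Send-Batch {
    param([object[]] $Batch)
    $script:sent += , $Batch   # a real version would send $Batch to SAP here
}

$maxArraySize = 2000
$list = New-Object System.Collections.ArrayList

foreach ($value in 1..7200) {
    [void]$list.Add($value)
    if ($list.Count -eq $maxArraySize) {
        Send-Batch -Batch $list.ToArray()
        $list.Clear()
    }
}
# flush the final, partially filled batch
if ($list.Count) { Send-Batch -Batch $list.ToArray() }

$script:sent.Count   # 4 batches: 2000 + 2000 + 2000 + 1200
```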

$source = Join-Path -Path $ENV:UserProfile -ChildPath "Documents\Test\DataSD.xls"
$maxArraySize = 2000

$Excel = New-Object -ComObject Excel.Application
# It would speed up things considerably if you set $Excel.Visible = $false
$WorkBook = $Excel.Workbooks.Open($source)
$WorkSheet = $WorkBook.WorkSheets.Item(1)
$WorkSheet.Activate()

# Create a Hashtable object to store each array under its own key
# I don't know if you need to keep the order of things later, 
# but it maybe best to use an '[ordered]' hash here.
# If you are using PowerShell version below 3.0. you need to create it using
# $hash = New-Object System.Collections.Specialized.OrderedDictionary
$hash = [ordered]@{}

# Create an ArrayList for better performance
$list = New-Object System.Collections.ArrayList
# Initiate a counter to use as Key in the Hashtable
$arrayCount = 0
# and maybe a counter for the total number of items to process?
$totalCount = 0

# Start reading the Excel data. Begin at row $row
$row = 2
do {
    $list.Clear()
    # Add the values of column 1 to the arraylist, but keep track of the maximum size
    while ($WorkSheet.Cells.Item($row, 1).Value() -ne $null -and $list.Count -lt $maxArraySize) {
        [void]$list.Add($WorkSheet.Cells.Item($row, 1).Value())
        $row++
    }
    if ($list.Count) {
        # Store this array in the Hashtable using the $arrayCount as Key. 
        $hash.Add($arrayCount.ToString(), $list.ToArray())
        # Increment the $arrayCount variable for the next iteration
        $arrayCount++
        # Update the total items counter
        $totalCount += $list.Count
    }
} while ($list.Count)

# You're done reading the Excel data, so close the workbook, quit Excel
# and release the COM objects from memory
$WorkBook.Close($false)
$Excel.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($WorkSheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($WorkBook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()

# At this point you should have all arrays stored in the hash to process
Write-Host "Processing $($hash.Count) arrays with a total of $totalCount items"

foreach ($key in $hash.Keys) {
    # Send each array to a SAP business objects query separately
    # The array itself is at $hash.$key or use $hash[$key]
}

3 Comments

Your script worked perfectly. It created the arrays in the hash table just as I needed. I forgot one small detail: in order to pass the variables into SAP BusinessObjects, I need the array to store them as items separated by semicolons, i.e. 123456; 246579. Would it be better to format the arrays after they have been built, or as they are being built? Either way, any suggestion as to how to code it? Thank you.
@SHartman thanks. It is really easy to convert the arrays to strings. I would advise doing that at 'sending time'. Just do a $hash.$key -join ';' and there's your string item to send.
Thank you so much. Everything worked and the output is what I expected. I needed to finish this project before the end of the year and with your help I'll be able to meet my deadline.
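To make the `-join` advice from the comments concrete, here is a minimal sketch; the sample hashtable stands in for the batches built earlier:

```powershell
# Join one batch into a single semicolon-separated string at sending time.
$hash = [ordered]@{ '0' = 123456, 246579 }   # sample data in place of the real batches
$payload = $hash['0'] -join ';'
$payload   # 123456;246579
```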

This is not 100% but I will fine-tune it a bit later today:

$docarray = @{}
$values = @()

$i = 0
$y = 0

for ($x = 0; $x -le 100; $x++) {
    if ($i -eq 20) {
        $docarray.add($y, $values)
        $y++
        $i=0
        $values = @()
    }

    $values += $x
    $i++
}
$docarray.add($y, $values) ## required to add the final, partially filled batch

$docarray | Format-List

If the limit is 2000, then you would set the if statement to trigger at 2000. The result will be a hash table with one entry per batch:

Name  : 4
Value : {80, 81, 82, 83...}

Name  : 3
Value : {60, 61, 62, 63...}

Name  : 2
Value : {40, 41, 42, 43...}

Name  : 1
Value : {20, 21, 22, 23...}

Name  : 0
Value : {0, 1, 2, 3...}

Each name in the hash table holds the number of values set by the $i counter in the if statement.

You should then be able to send the batches to your SAP business objects query by looping over the entries of the hash table:

foreach ($item in $docarray.GetEnumerator()) {
    # $item.Value is one batch of values
    $item.Value
}

