I have an annoying report output (curse be to HP) that I cannot shape with a query; it is more or less all or nothing. I would like to take two lines from each "chunk" of output and construct an array from them. I figured it would be a simple split() operation, but no such luck. A sample of the output follows:

Medium identifier : 1800010a:54bceddd:1d8c:0007

Medium label             : [ARJ170L6] ARJ170L6
Location                 : [TapeLibrary:    24]
Medium Owner             : wfukut01
Status                   : Poor
Blocks used  [KB]        : 2827596544
Blocks total [KB]        : 2827596544
Usable space [KB]        : 1024
Number of writes         : 16
Number of overwrites     : 4
Number of errors         : 0
Medium initialized       : 19 January 2015, 11:43:32
Last write               : 26 April 2016, 21:02:12
Last access              : 26 April 2016, 21:02:12
Last overwrite           : 24 April 2016, 04:48:55
Protected                : Permanent
Write-protected          : No


Medium identifier : 1800010a:550aa81e:3a0c:0006

Medium label             : [ARJ214L6] ARJ214L6
Location                 : External
Medium Owner             : wfukut01
Status                   : Poor
Blocks used  [KB]        : 2904963584
Blocks total [KB]        : 2904963584
Usable space [KB]        : 0
Number of writes         : 9
Number of overwrites     : 7
Number of errors         : 0
Medium initialized       : 19 March 2015, 10:42:45
Last write               : 30 April 2016, 22:14:19
Last access              : 30 April 2016, 22:14:19
Last overwrite           : 29 April 2016, 13:41:35
Protected                : Permanent
Write-protected          : No

What would be ideal is if the final output of this work would create an array somewhat similar to this:

Location                     UsableSpace
---------                    -------------
External                     0
TapeLibrary                  1024

So I can (for example) query the output so that I can do operations on the data within the array:

$myvar | where-object { $_.Location -eq "TapeLibrary" }

Perhaps there are better approaches? I would be more than happy to hear them!

    What command is generating that output? Commented May 15, 2018 at 16:15

5 Answers

If the command is not a PowerShell cmdlet (as Kolob Canyon's answer assumes), then you need to parse the text. Here's an inelegant example that uses -match and a regex to find the lines containing Location and Usable space [KB], and captures the first run of word characters after the colon.

((Get-Content C:\Example.txt -Raw) -split 'Medium identifier') | ForEach-Object {
    # Reset per chunk so a failed match cannot reuse values from the previous chunk
    $Location = $UsableSpace = $null
    # Capture the first run of word characters after the colon (e.g. TapeLibrary, External)
    if ($_ -match 'Location\s+:[^\w\r\n]*(\w+)') {
        $Location = $Matches[1]
    }
    if ($_ -match 'Usable\sspace\s\[KB\]\s+:[^\w\r\n]*(\w+)') {
        $UsableSpace = $Matches[1]
    }
    # Skip chunks (such as the empty one before the first record) that have neither field
    if ($Location -or $UsableSpace) {
        [PSCustomObject]@{
            Location    = $Location
            UsableSpace = $UsableSpace
        }
    }
}

As this is extremely fragile and inelegant, it's much better to work with objects wherever possible.

3 Comments

I was also wondering (since the output is key:*value* pairs) if he could just surround the output in {} and run ConvertFrom-Json to convert it to an object
@KolobCanyon That's a great idea. But I think there would be a lot of escaping that would have to be done. e.g. Medium label = > 'Medium Label'
Yeah, that's what I was thinking too

Assuming the data is as regular as it looks, you could use multiple assignment to extract the data from the array as in:

$data = 1, 2, "ignore me", 3, 10, 22, "ignore", 30
$first, $second, $null, $third, $data = $data

Here the first, second and fourth array elements go into the variables, "ignore me" is discarded into $null, and the remaining elements go back into $data. In your case, this would look like:

# Read the file into an array
$data = Get-Content data.txt

# Utility to fix up the data row
function FixUp ($s)
{
    ($s -split ' : ')[1].Trim()
}

# Loop until all of the data is processed
while ($data)
{
    # Extract the current record using multiple assignment
    # $null is used to eat the blank lines
    $identifier,$null,$label,$location,$owner,$status,
        $used,$total,$space,$writes,$overwrites,
        $errors, $initialized, $lastwrite, $lastaccess,
        $lastOverwrite, $protected, $writeprotected,
        $null, $null, $data = $data

    # Convert it into a custom object 
    [PSCustomObject] [ordered] @{
        Identifier   = fixup $identifier
        Label        = fixup $label
        location     = fixup $location
        Owner        = fixup $owner
        Status       = fixup $status
        Used         = fixup $used
        Total        = fixup $total
        Space        = fixup $space
        Write        = fixup $writes
        OverWrites   = fixup $overwrites
        Errors       = fixup $errors
        Initialized  = fixup $initialized
        LastWrite    = [datetime] (fixup $lastwrite)
        LastAccess   = [datetime] (fixup $lastaccess)
        LastOverWrite = [datetime] (fixup $lastOverwrite)
        Protected    = fixup $protected
        WriteProtected = fixup $writeprotected
    }
}

Once you have the data extracted, you can format it any way you want.
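
As a rough sketch of that last step, the snippet below reduces such objects to the two columns asked for in the question. It assumes the loop's output has been collected into a variable, here called $records (a name chosen for illustration, not taken from the answer), and uses two hand-written sample objects in its place.

```powershell
# Hypothetical stand-in for the objects emitted by the while loop above
$records = @(
    [PSCustomObject]@{ Location = '[TapeLibrary:    24]'; Space = '1024' }
    [PSCustomObject]@{ Location = 'External';             Space = '0' }
)

# Reduce each record to the two columns from the question
$summary = $records | ForEach-Object {
    [PSCustomObject]@{
        # Keep only the first word, so '[TapeLibrary:    24]' becomes 'TapeLibrary'
        Location    = [regex]::Match($_.Location, '\w+').Value
        # Cast to a number so the value can be compared or summed
        UsableSpace = [long]$_.Space
    }
}

# Query the result as in the question
$summary | Where-Object { $_.Location -eq 'TapeLibrary' }
```

Casting UsableSpace to [long] is a design choice for this sketch; leave it as a string if you only need to display it.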

1 Comment

Like that very much, +1. All datetimes get cast properly. If you assign the output of the whole while loop to a new variable and group it / measure -sum on Space, you get the desired end result.

That looks like a very regular pattern, so I'd say there are three typical approaches to this.

First: your own bucket-fill-trigger-empty parser, load lines in until you reach the next trigger ("Medium identifier"), then empty out the bucket to the pipeline and start a new one.

Something like:

$bucket = @{}

foreach ($line in Get-Content -LiteralPath C:\path\data.txt)
{
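    # if full, empty bucket to pipeline
    # (the Count check avoids emitting an empty object for the very first record)
    if ($line -match '^Medium identifier' -and $bucket.Count -gt 0)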
    {
        [PSCustomObject]$bucket
        $bucket = @{}
    }

    # add line to bucket (unless it's blank)
    if (-not [string]::IsNullOrWhiteSpace($line))
    {
        $left, $right = $line.Split(':', 2)
        $bucket[$left.Trim()] = $right.Trim()
    }
}

# empty last item to pipeline
[PSCustomObject]$bucket

Adjust to taste for identifying numbers, dates, etc.
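
One possible way to do that adjustment is sketched below. Convert-FieldValue is a hypothetical helper (it is not part of the answer above), and the date format string is an assumption inferred from the sample output in the question.

```powershell
# Hypothetical helper: turn a raw field value into a number or a date where possible
function Convert-FieldValue ([string]$Value)
{
    # Whole numbers such as '1024' become [long]
    $number = 0L
    if ([long]::TryParse($Value, [ref]$number)) { return $number }

    # Dates such as '26 April 2016, 21:02:12' become [datetime]
    # (format assumed from the sample report)
    $date = [datetime]::MinValue
    $parsed = [datetime]::TryParseExact(
        $Value,
        'd MMMM yyyy, HH:mm:ss',
        [cultureinfo]::InvariantCulture,
        [System.Globalization.DateTimeStyles]::None,
        [ref]$date)
    if ($parsed) { return $date }

    # Anything else stays a string
    return $Value
}
```

In the loop above it would be applied when filling the bucket, for example $bucket[$left.Trim()] = Convert-FieldValue $right.Trim().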

Second: a multiline regex. I tried, but couldn't get it working. It would look something like:

# Not working, but for example:
$r = @'
Medium identifier    : (?<MediumIdentifier>.*)
\s*
Write-protected      : (?<WriteProtected>.*)
Blocks used  [KB]    : (?<BlockesUsed>.*)
Medium label         : (?<MediumLabel>.*)
Last write           : (?<LastWrite>.*)
Medium Owner         : (?<MediumOwner>.*)
Usable space [KB]    : (?<UsableSpaceKB>.*)
Number of overwrites : (?<NumberOfOverwrites>.*)
Last overwrite       : (?<LastOverwrite>.*)
Medium identifier    : (?<MediumIdentifier>.*)
Blocks total [KB]    : (?<BlocksTotalKB>.*)
Number of errors     : (?<NumberOfErrors>.*)
Medium initialized   : (?<MediumInitialized>.*)
Status               : (?<Status>.*)
Location             : (?<Location>.*)
Protected            : (?<Protected>.*)
Number of writes     : (?<NumberOfWrites>.*)
Last access          : (?<LastAccess>.*)
\s*
'@

[regex]::Matches((get-content C:\work\a.txt -Raw), $r, 
    [System.Text.RegularExpressions.RegexOptions]::IgnoreCase + 
    [System.Text.RegularExpressions.RegexOptions]::Singleline
    )
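
For comparison, here is a much narrower sketch that does run, under the assumption that only Location and Usable space are needed and that Usable space always follows Location within a record. The sample text is an abridged copy of the output in the question; the pattern and variable names are illustrative and are not a repaired version of the full regex above.

```powershell
# Abridged sample standing in for the report file
$text = @'
Medium identifier : 1800010a:54bceddd:1d8c:0007

Medium label             : [ARJ170L6] ARJ170L6
Location                 : [TapeLibrary:    24]
Medium Owner             : wfukut01
Status                   : Poor
Blocks used  [KB]        : 2827596544
Blocks total [KB]        : 2827596544
Usable space [KB]        : 1024


Medium identifier : 1800010a:550aa81e:3a0c:0006

Medium label             : [ARJ214L6] ARJ214L6
Location                 : External
Medium Owner             : wfukut01
Status                   : Poor
Blocks used  [KB]        : 2904963584
Blocks total [KB]        : 2904963584
Usable space [KB]        : 0
'@

# (?s) lets '.' cross line breaks; the lazy '.*?' stops at the first Usable space line
$pattern = '(?s)Location\s+:\s*\[?(?<Location>\w+).*?Usable space \[KB\]\s+:\s*(?<UsableSpace>\d+)'

$result = foreach ($m in [regex]::Matches($text, $pattern))
{
    [PSCustomObject]@{
        Location    = $m.Groups['Location'].Value
        UsableSpace = [long]$m.Groups['UsableSpace'].Value
    }
}

$result
```

If a record were ever missing its Usable space line, this pattern would pair that Location with the next record's value, so it is only suitable when the report is as regular as it looks.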

Third: ConvertFrom-String. See http://www.lazywinadmin.com/2014/09/powershell-convertfrom-string-and.html or https://blogs.technet.microsoft.com/ashleymcglone/2016/09/14/use-the-new-powershell-cmdlet-convertfrom-string-to-parse-klist-kerberos-ticket-output/ for how to build a template. Once you have made the template:

Get-Content data.txt | ConvertFrom-String -TemplateFile .\template.txt

The easiest way is to use the command itself and select the properties you need. If the command is a PowerShell cmdlet, it should return objects.

$output = Some-HPCommand | select 'Medium label', 'Location' 

Then you can access specific properties:

$output.'Medium label'
$output.Location

If you can provide the exact command, I can write this more accurately.

The biggest issue when people are learning PowerShell is that they treat output as strings. Everything in PowerShell is object-oriented, and once you begin to think in terms of objects, it becomes much easier to process data. In other words, always try to handle output as objects or arrays of objects. It will make your life a hell of a lot easier.

1 Comment

I wish that were the case. This is absolutely not a PS cmdlet. Just a big ole bunch of strings.

If each section is in the same format, i.e. the Usable space line is always 5 lines below the Location line, then you can use Select-String in combination with the -Context parameter. Something like this:

Select-String .\your_file.txt -Pattern '(?<=Location\s*:\s).*' -Context 0, 5 | % {
    New-Object psobject -Property @{
        Location = (($_.Matches[0] -replace '\[|\]', '') -split ':')[0]
        UsableSpace = ($_.Context.PostContext[4] -replace '^\D+(\d+)$', '$1' )
    }
}
