
I am trying to extract multiple pieces of data (first name, last name, ID number) from a rather nasty log file.

I have this:

Get-Content c:\LOG\22JAN01.log | Out-String | 
  % {[Regex]::Matches($_, "(?<=FIRST:)((.|\n)*?)(?=LAST:)")} | % {$_.Value}

This does a fine job of extracting the first name, but I also need to get the last name and ID number from the same line and present them together, e.g. "BOB SMITH 123456".

Each line of the log file looks like this:

FIRST:BOB LAST:SMITH DOOR:MAIN ENTRANCE ID:123456 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304

I would like the output to look something like:

  • BOB SMITH 123456
  • JACK JONES 029506
  • KAREN KARPENTER 6890298

So far I can only manage to get all the first names and nothing else. Thanks for any help pointing me in the right direction!

  • Does the log file look literally like what we see in the quoted text? First name, last name and door all on the same line? Commented Jan 1, 2022 at 18:22

5 Answers

If they are always on the same line, I like to use switch to read it.

switch -Regex -File c:\LOG\22JAN01.log {
    'FIRST:(\w+) LAST:(.+) DOOR.+ ID:(\d+) ' {
        [PSCustomObject]@{
            First = $matches[1]
            Last  = $matches[2]
            ID    = $matches[3]
        }
    }
}

Sample log output

First Last      ID     
----- ----      --     
BOB   SMITH     123456 
JACK  JONES     029506 
KAREN KARPENTER 6890298

You can capture it to a variable and then continue using the objects however you like.

$output = switch -Regex -File c:\LOG\22JAN01.log {
    'FIRST:(\w+) LAST:(.+) DOOR.+ ID:(\d+) ' {
        [PSCustomObject]@{
            First = $matches[1]
            Last  = $matches[2]
            ID    = $matches[3]
        }
    }
}

$output | Out-GridView

$output | Export-Csv -Path c:\Log\parsed_log.log -NoTypeInformation
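
If the goal is the plain "BOB SMITH 123456" text from the question rather than objects, the same switch can emit formatted strings directly. This is only a minimal sketch: $sampleLines is made-up data standing in for the log file, and it relies on switch -Regex accepting an array in place of -File.

```powershell
# Hypothetical sample data standing in for the log file.
$sampleLines = @(
    'FIRST:BOB LAST:SMITH DOOR:MAIN ENTRANCE ID:123456 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304'
    'FIRST:JACK LAST:JONES DOOR:SIDE ENTRANCE ID:029506 TIME:Friday, December 31, 2021 11:55:48 PM INCIDENT:19002305'
)

# switch enumerates the array and tests each line against the pattern.
$formatted = switch -Regex ($sampleLines) {
    'FIRST:(\w+) LAST:(.+) DOOR.+ ID:(\d+) ' {
        # Emit one "FIRST LAST ID" string per matching line.
        '{0} {1} {2}' -f $matches[1], $matches[2], $matches[3]
    }
}

$formatted
```

Lines that do not match the pattern are silently skipped, so $formatted may hold fewer entries than the input.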

You need to use capture groups ().

Assuming that FIRST is always right at the start of the line (remove the ^ if not), and that the field names are always present and in the same order, and that their values are at least one character long, you could use, for example:

$result = & {
  $path = "c:\LOG\22JAN01.log";
  $pattern = "^FIRST:(.+?) LAST:(.+?) DOOR:.+? ID:(\d+)";
  Select-String -Path $path -Pattern $pattern -AllMatches |
  % {$_.Matches.Groups[1], $_.Matches.Groups[2], $_.Matches.Groups[3] -join " "}
}

.+? means match one or more of any character except newlines, as few times as possible before what follows in the pattern can be matched. Something more restrictive such as [A-Z]+ can be used instead if that will definitely match the required values.
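
To make the lazy-versus-greedy difference visible, here is a small sketch on an invented line that carries a second " ID:" near the end. The NOTE field and its value are assumptions for illustration only; nothing in the real log is known to look like this.

```powershell
# Invented line with a second " ID:" later on, to make the difference visible.
$line = 'FIRST:BOB LAST:SMITH DOOR:MAIN ENTRANCE ID:123456 NOTE:old ID:999'

# Lazy: .+? stops at the first " ID:" that lets the rest of the pattern match.
$null = $line -match '^FIRST:(.+?) LAST:(.+?) DOOR:.+? ID:(\d+)'
$lazyId = $matches[3]

# Greedy: .+ runs to the end of the line and backtracks to the last " ID:".
$null = $line -match '^FIRST:(.+) LAST:(.+) DOOR:.+ ID:(\d+)'
$greedyId = $matches[3]

# Restrictive character classes for the two names.
$null = $line -match '^FIRST:([A-Z]+) LAST:([A-Z]+) DOOR:.+? ID:(\d+)'
$strictName = $matches[1], $matches[2] -join ' '

"lazy=$lazyId greedy=$greedyId strict=$strictName"
```

Note that -match is case-insensitive by default, so [A-Z]+ would also accept lowercase letters; -cmatch is the case-sensitive variant.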


If you can make the assumption that each field name is composed of (English) letters only,[1] such as FIRST, a generic solution that combines the -replace operator with the ConvertFrom-StringData cmdlet is possible:

# Sample array of input lines.
$inputLines = 
  'FIRST:BOB LAST:SMITH DOOR:MAIN ENTRANCE ID:123456 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304',
  'FIRST:JACK LAST:JONES DOOR:SIDE ENTRANCE ID:123457 TIME:Friday, December 31, 2021 11:55:48 PM INCIDENT:19002305',
  'FIRST:KAREN LAST:KARPENTER DOOR:BACK ENTRANCE ID:123458 TIME:Friday, December 31, 2021 11:55:49 PM INCIDENT:19002306'

$inputLines -replace '\b([a-z]+):', "`n`$1=" | 
  ConvertFrom-StringData |
    ForEach-Object { $_.FIRST, $_.LAST, $_.ID -join ' ' }
  • For each input line, the -replace operation places each field name-value pair onto its own line, replacing the separator, :, with =.

  • The resulting block of lines is parsed by ConvertFrom-StringData into a hashtable representing the fields of each input line, allowing convenient access to the fields by name, e.g. .FIRST (PowerShell allows you to use property-access syntax as an alternative to index syntax, e.g. ['FIRST']).
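
The two steps described above can be inspected separately. The following sketch uses the sample line from the question; the variable names $block and $fields are just illustrative.

```powershell
$line = 'FIRST:BOB LAST:SMITH DOOR:MAIN ENTRANCE ID:123456 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304'

# Step 1: put each field on its own line and swap ':' for '='.
# The colons inside 11:55:47 are untouched because they follow digits, not letters.
$block = $line -replace '\b([a-z]+):', "`n`$1="
$block

# Step 2: parse the key=value lines into a hashtable.
$fields = $block | ConvertFrom-StringData
$fields['DOOR']   # index syntax
$fields.TIME      # property-access syntax
```

Because every field ends up in the hashtable, values containing spaces (such as DOOR and TIME) are available too, not just the three the question asks for.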

Output:

BOB SMITH 123456
JACK JONES 123457
KAREN KARPENTER 123458

[1] More generally, you can use this approach as long as you can formulate a regex that unambiguously identifies a field name.


Using this reusable function:
(See also: #16257 String >>>Regex>>> PSCustomObject)

function ConvertFrom-Text {
    [CmdletBinding()]Param (
        [Regex]$Pattern,
        [Parameter(Mandatory = $True, ValueFromPipeLine = $True)]$InputObject
    )
    process {
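        # Test the bound parameter so the function also works with -InputObject, not only via the pipeline.
        if ($InputObject -match $Pattern) {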
            $matches.Remove(0)
            [PSCustomObject]$matches
        }
    }
}

$log = @(
    'FIRST:BOB LAST:SMITH DOOR:MAIN ENTRANCE ID:123456 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304'
    'FIRST:JOHN LAST:DOE DOOR:MAIN ENTRANCE ID:789101 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304'
)

$Log |ConvertFrom-Text -Pattern '\bFIRST:(?<First>\S*).*\bLAST:(?<Last>\S*).*\bID:(?<ID>\d+)'

ID     Last  First
--     ----  -----
123456 SMITH BOB
789101 DOE   JOHN

1 Comment

A neat function, but if you're using capture groups there's no need for lookbehinds: (?<=FIRST:) would be better as just FIRST: etc.

Assuming the log file looks literally like what we see in the quoted text, you could match it like this:

$log = @'
FIRST:BOB LAST:SMITH DOOR:MAIN ENTRANCE ID:123456 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304
FIRST:JOHN LAST:DOE DOOR:MAIN ENTRANCE ID:789101 TIME:Friday, December 31, 2021 11:55:47 PM INCIDENT:19002304
'@

$re = [regex]'(?si)FIRST:(?<first>.*?)\s*LAST:(?<last>.*?)\s*DOOR.*?ID:(?<id>\d+)'

foreach($match in $re.Matches($log))
{
    '{0} {1} {2}' -f
        $match.Groups['first'].Value,
        $match.Groups['last'].Value,
        $match.Groups['id'].Value
}

# Results in:
BOB SMITH 123456
JOHN DOE 789101

This regex works on a single multi-line string, so you would use -Raw with Get-Content:

$re = [regex]'(?si)FIRST:(?<first>.*?)\s*LAST:(?<last>.*?)\s*DOOR.*?ID:(?<id>\d+)'

$result = foreach($match in $re.Matches((Get-Content ./test.log -Raw)))
{
    [pscustomobject]@{
        First = $match.Groups['first'].Value
        Last  = $match.Groups['last'].Value
        ID    = $match.Groups['id'].Value
    }
}

$result | Export-Csv path/to/newlog.csv -NoTypeInformation

See https://regex101.com/r/WluWpD/1 for the regex explanation.
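
One thing to keep in mind with the (?s) option on a whole-file string: if a record were ever missing a field, the lazy .*? could run across the line break into the next record. The sketch below illustrates this with invented input (the first record has no LAST field) and compares it with matching line by line; whether real logs contain such records is an assumption.

```powershell
$re = [regex]'(?si)FIRST:(?<first>.*?)\s*LAST:(?<last>.*?)\s*DOOR.*?ID:(?<id>\d+)'

# Invented input: the first record has no LAST field.
$text = "FIRST:BOB DOOR:SIDE ID:111`nFIRST:JACK LAST:JONES DOOR:MAIN ID:222"

# Whole-string matching: (?s) lets .*? run across the line break,
# so the first record's text ends up inside the 'first' group.
$whole = @($re.Matches($text))

# Per-line matching: each line is tested on its own, so the incomplete line is skipped.
$perLine = @($text -split "`n" | ForEach-Object { $re.Match($_) } | Where-Object Success)

$whole[0].Groups['first'].Value
$perLine[0].Groups['first'].Value
```

If every record is known to be complete, both approaches give the same results and the -Raw version is simply the shorter one.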

12 Comments

The value of DOOR in the example is MAIN ENTRANCE, so ENTRANCE should not be in your regex as other doors may not include it. Also, the .*? at the end is pointless. Looks good otherwise.
@MikeM I'm dumb, thank you! for both things (.*? too). Learning regex is hard hehe
So far this seems to work! At least it displays what I need. Now just to figure out how to write it to a file and I'll be all set I think!
@DerekB see my last edit, you just need to collect the results of the foreach loop in a variable ($result) and then Out-File the results.
Ah yes I was missing the one $result = line - now it exports to a file - thank you!