I have a set of strings gathered from logs that I'm trying to parse into unique entries:
function Scan ($path, $logPaths, $pattern)
{
    $logPaths | % `
    {
        $file = $_.FullName
        Write-Host "`n[$file]"
        Get-Content $file | Select-String -Pattern $pattern -CaseSensitive -AllMatches | % `
        {
            $regexDateTime = New-Object System.Text.RegularExpressions.Regex "((?:\d{4})-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}(,\d{3})?)"
            $matchDate = $regexDateTime.match($_)
            if($matchDate.success)
            {
                $loglinedate = [System.DateTime]::ParseExact($matchDate, "yyyy-MM-dd HH:mm:ss,FFF", [System.Globalization.CultureInfo]::InvariantCulture)
                if ($loglinedate -gt $laterThan)
                {
                    $date = $($_.toString().TrimStart() -split ']')[0]
                    $message = $($_.toString().TrimStart() -split ']')[1]
                    $messageArr += ,$date,$message
                }
            }
        }
        $messageArr | sort $message -Unique | foreach { Write-Host -f Green $date$message}
    }
}
So for this input:
2015-09-04 07:50:06 [20] WARN Core.Ports.Services.ReferenceDataCheckers.SharedCheckers.DocumentLibraryMustExistService - A DocumentLibrary 3 could not be found.
2015-09-04 07:50:06 [20] WARN Core.Ports.Services.ReferenceDataCheckers.SharedCheckers.DocumentLibraryMustExistService - A DocumentLibrary 3 could not be found.
2015-09-04 07:50:16 [20] WARN Brighter - The message abc123 has been marked as obsolete by the consumer as the entity has a higher version on the consumer side.
Only two entries should be returned: one copy of the duplicated first line, and the last line.
I'm having trouble filtering out duplicates of $message: currently all entries are returned (sort -Unique is not behaving as I expect). I also need the correct $date to be returned alongside each filtered $message.
I'm pretty stuck with this. Can anyone help?
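One possible direction, as a minimal sketch rather than a drop-in fix: keep each date and message together in a single object, then deduplicate on the message property only. It appears that the flat array built with += has no Message property for sort to compare on, and that sort $message passes the text of the last message as the property name, which would explain why nothing is filtered. The names $lines, $entries and $unique and the property names Date and Message are illustrative and not from the original script; the sketch assumes PowerShell 3.0 or later for [pscustomobject] and uses the sample lines from the question as hard-coded input.

```powershell
# Sample lines copied from the question (hard-coded here for illustration).
$lines = @(
    '2015-09-04 07:50:06 [20] WARN Core.Ports.Services.ReferenceDataCheckers.SharedCheckers.DocumentLibraryMustExistService - A DocumentLibrary 3 could not be found.'
    '2015-09-04 07:50:06 [20] WARN Core.Ports.Services.ReferenceDataCheckers.SharedCheckers.DocumentLibraryMustExistService - A DocumentLibrary 3 could not be found.'
    '2015-09-04 07:50:16 [20] WARN Brighter - The message abc123 has been marked as obsolete by the consumer as the entity has a higher version on the consumer side.'
)

# Build one object per line so the date stays attached to its message.
$entries = foreach ($line in $lines) {
    # Split on the first ']' only; the limit of 2 keeps any later ']' in the message.
    $parts = $line.TrimStart() -split '\]', 2
    [pscustomobject]@{
        Date    = $parts[0] + ']'   # put back the ']' consumed by the split
        Message = $parts[1]
    }
}

# With -Property, Sort-Object -Unique compares only the named property,
# so entries with the same Message collapse to a single object.
$unique = @($entries | Sort-Object -Property Message -Unique)

$unique | ForEach-Object { Write-Host -ForegroundColor Green ($_.Date + $_.Message) }
```

For the sample input this leaves two entries. If the same message can occur with different dates, which date survives the sort is not something to rely on; grouping with Group-Object on Message and picking an element from each group explicitly would make that choice visible.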