First of all, I've got a reliable search (thanks to some help on Stack Overflow) that checks for occurrences of different strings in a line across many log files.

I've now been tasked with including multiple searches, and since there are about 20 files and about a dozen search criteria, I don't want to have to access these files over 200 times. I believe the best way of doing this is with an array, but so far all the methods I've tried have failed.

The search criteria are made up of the date, which obviously changes every day, a fixed string (ERROR), and a unique Java class name. Here is what I have:

        $dateStr = Get-Date -Format "yyyy-MM-dd"
        $errword = 'ERROR'
        $word01 = [regex]::Escape('java.util.exception')   
    
        $pattern01 = "${dateStr}.+${errword}.+${word01}"
    
        $count01 = (Get-ChildItem -Filter $logdir -Recurse | Select-String -Pattern $pattern01 -AllMatches |ForEach-Object Matches |Measure-Object).Count
        Add-Content $outfile  "$dateStr,$word01,$count01"

The easy way to expand this is to have a separate three-command entry (set the word, set the pattern, then search) for each class I want to search for. I've done that and it works, but it's not elegant, and it means accessing the log files more than 200 times to run the searches. I've also tried reading the Java class names in from a simple text file, with mixed results, but it's the only approach I've been able to get to work to simplify the search for 12 different patterns.

  • Have you tried generating all your combined patterns from your input sources, building a regex OR, and using that in your Select-String call? Also, the -Path parameter of Select-String will read lines in far faster than using Get-Content and a pipeline stage. Commented Sep 7, 2021 at 13:29
  • The -Pattern parameter of Select-String supports a string array. Try this: 'One Two Three' |Select-String -Pattern 'One', 'Three', and this: 'Two Three Four' |Select-String -Pattern 'One', 'Three' (Both input lines are selected, because each matches at least one of the two search patterns.) In other words, you can just do: ... |Select-String -Pattern $pattern01, $pattern02, $pattern03 (which means: select the string that matches $pattern01 or $pattern02 or $pattern03). Commented Sep 7, 2021 at 13:38

1 Answer

iRon provided an important pointer: Select-String can accept an array of patterns to search for, and reports matches for lines that match any one of them.

You can then get away with a single Select-String call, combined with a Group-Object call that allows you to group all matching lines by which pattern matched:

# Create the input file with class names to search for.
@'
java.util.exception
java.util.exception2
'@ > classNames.txt

# Construct the array of search patterns,
# and add them to a map (hashtable) that maps each
# pattern to the original class name.
$dateStr = Get-Date -Format 'yyyy-MM-dd'
$patternMap = [ordered] @{}
Get-Content classNames.txt | ForEach-Object {
  $patternMap[('{0}.+{1}.+{2}' -f $dateStr, 'ERROR', [regex]::Escape($_))] = $_
}

# Search across all files, using multiple patterns.
Get-ChildItem -File -Recurse $logdir | Select-String @($patternMap.Keys) |
  # Group matches by the matching pattern.
  Group-Object Pattern |
    # Output the result; send to `Set-Content` as needed.
    ForEach-Object { '{0},{1},{2}' -f $dateStr, $patternMap[$_.Name], $_.Count }
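
To see the grouping step in isolation, here is a minimal sketch that runs the same pipeline against in-memory strings instead of files. The date, the class names, and the sample log lines are all made up for illustration and are not taken from the original logs.

```powershell
# Hypothetical sample data: fixed date, made-up class names and log lines.
$dateStr    = '2021-09-07'
$classNames = 'java.io.IOException', 'java.lang.NullPointerException'
$lines = @(
  '2021-09-07 10:00:01 ERROR java.io.IOException: disk full'
  '2021-09-07 10:00:02 ERROR java.lang.NullPointerException: oops'
  '2021-09-07 10:00:03 INFO  java.io.IOException: just informational'
  '2021-09-07 10:00:04 ERROR java.io.IOException: disk full again'
)

# Map each regex pattern back to the class name it was built from.
$patternMap = [ordered] @{}
foreach ($name in $classNames) {
  $patternMap[('{0}.+{1}.+{2}' -f $dateStr, 'ERROR', [regex]::Escape($name))] = $name
}

# One Select-String call with all patterns; each match object records
# the pattern that matched in its Pattern property, which is grouped on.
$counts = $lines | Select-String -Pattern @($patternMap.Keys) |
  Group-Object Pattern |
  ForEach-Object { '{0},{1},{2}' -f $dateStr, $patternMap[$_.Name], $_.Count }

$counts
```

With this sample input, the INFO line is skipped, java.io.IOException is counted twice, and java.lang.NullPointerException once. The class names were deliberately chosen so that neither is a prefix of the other, which keeps each line attributable to exactly one pattern.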

Note:

  • $logDir, as the name suggests, is presumed to refer to a directory in which to (recursively) search for log files. Passing a directory to -Filter wouldn't work, so I've removed -Filter, which makes $logDir bind positionally to the -Path parameter. -File limits the results to files; if other types of files are also present, add a -Filter argument as needed, e.g. -Filter *.log

  • Select-String's -AllMatches switch is generally not required - you only need it if any of the patterns can match multiple times per line and you want to capture all of those matches.

  • Using @(...), the array-subexpression operator around the collection of the hashtable's keys, $patternMap.Keys, i.e. the search patterns, is required purely for technical reasons: it forces the collection to be convertible to an array of strings ([string[]]), which is how the -Pattern parameter is typed.

    • The need for @(...) is surprising, and may be indicative of a bug, as of PowerShell 7.2; see GitHub issue #16061.
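
As a small illustration of the -AllMatches point above, the following sketch compares the number of matches reported for a single line with and without the switch. The sample line is made up, and the pattern is a plain 'ERROR' rather than the date-based patterns used earlier.

```powershell
# Hypothetical input line in which the pattern occurs twice.
$line = 'ERROR foo ERROR bar'

# Without -AllMatches, only the first match on the line is reported.
$first = $line | Select-String -Pattern 'ERROR'

# With -AllMatches, every match on the line is collected in .Matches.
$all = $line | Select-String -Pattern 'ERROR' -AllMatches

'Without -AllMatches: {0}' -f $first.Matches.Count
'With -AllMatches:    {0}' -f $all.Matches.Count
```

Either way, Select-String emits one result object per matching line, so counting lines (as the Group-Object approach does) is unaffected by the switch; it only matters when the individual matches within a line are counted.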