5

I have been trying to extract certain values from multiple lines inside a .txt file with PowerShell.

Host
Class
INCLUDE vmware:/?filter=Displayname Equal "server01" OR Displayname Equal "server02" OR Displayname Equal "server03 test"

This is what I want:

server01
server02
server03 test

This is the code I have so far:

$Regex = [Regex]::new("(?<=Equal)(.*)(?=OR)")
$Match = $Regex.Match($String)
1

4 Answers

4

You may use

[regex]::matches($String, '(?<=Equal\s*")[^"]+')

See the regex demo.

See more ways to extract multiple matches here. However, your main problem is the regex pattern. The (?<=Equal\s*")[^"]+ pattern matches:

  • (?<=Equal\s*") - a location preceded with Equal and 0+ whitespaces and then a "
  • [^"]+ - consumes 1+ chars other than double quotation mark.
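To check that the lookbehind only asserts context and is not part of the match itself, you can inspect where each match starts. This is a small sketch on a shortened input; the variable names $sample and $found are chosen here for illustration and are not from the question:

```powershell
# Shortened sample input; $sample is an illustrative name, not from the question.
$sample = 'Displayname Equal "server01" OR Displayname Equal "server03 test"'

# The lookbehind (?<=Equal\s*") only checks what precedes the match;
# [^"]+ is the only part that ends up in the match value.
$found = [regex]::Matches($sample, '(?<=Equal\s*")[^"]+')

foreach ($m in $found) {
    # Index is the position of the first character after the opening quote.
    '{0} (index {1}, length {2})' -f $m.Value, $m.Index, $m.Length
}
```

Because the quote and the word Equal are never consumed, no post-processing is needed to strip them from the values.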

Demo:

$String = "Host`nClass`nINCLUDE vmware:/?filter=Displayname Equal ""server01"" OR Displayname Equal ""server02"" OR Displayname Equal ""server03 test"""
[regex]::matches($String, '(?<=Equal\s*")[^"]+') | Foreach {$_.Value}

Output:

server01
server02
server03 test

Here is a full snippet reading the file in, getting all matches and saving to file:

$newfile = 'file.txt'
$file = 'newtext.txt'
$regex = '(?<=Equal\s*")[^"]+'
Get-Content $file | 
     Select-String $regex -AllMatches | 
     Select-Object -Expand Matches | 
     ForEach-Object { $_.Value } |
     Set-Content $newfile

4 Comments

FYI, if the matches may span across multiple lines, replace Get-Content $file | with Get-Content $file | Out-String |, or, if you're using PowerShell v3 or newer, Get-Content $file -Raw |
What happens if there are multiple INCLUDE vmware:/?filter lines? I have opened a new question: stackoverflow.com/questions/54646160/…
@Arbelac Could you please explain the issue?
@Arbelac Try $regex.matches( $String.Split("`r`n").where({$_.contains('"')})[0] ).groups.where{$_.name -eq 1}.value | sc "c:\temp\result.txt" or, change the line selecting condition to a safer one, .where({$_ -match '"[^"]+"'})
2

You can modify your regex to use a capture group, which is indicated by the parentheses. The backslashes just escape the quotes. This allows you to just capture what you are looking for and then filter it further. The capture group here is automatically named 1 since I didn't provide a name. Capture group 0 is the entire match including quotes. I switched to the Matches method because that encompasses all matches for the string whereas Match only captures the first match.

$regex = [regex]'\"(.*?)\"'    
$regex.matches($string).groups.where{$_.name -eq 1}.value
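As a possible alternative to filtering the flattened group list by name, each match's Groups collection can be indexed directly. The sketch below uses an illustrative input ($line and $rows are names made up for this example) and shows group 0 next to group 1:

```powershell
# Illustrative input; $line is a name chosen for this sketch.
$line  = 'Equal "server01" OR Equal "server02"'
$regex = [regex]'"(.*?)"'

$rows = foreach ($m in $regex.Matches($line)) {
    # Groups[0] is the whole match (quotes included); Groups[1] is the capture.
    [pscustomobject]@{
        WholeMatch = $m.Groups[0].Value
        Captured   = $m.Groups[1].Value
    }
}

$rows
```

Indexing with Groups[1] avoids the script block passed to .where, at the cost of an explicit loop.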

If you want to export the results, you can do the following:

$regex = [regex]'\"(.*?)\"'    
$regex.matches($string).groups.where{$_.name -eq 1}.value | sc "c:\temp\export.txt"

1 Comment

+1 for a nice generalization. Quibbles: No need to \ -escape ". Syntax .where{...} definitely works, but to me the more verbose form .where({...}) is preferable for conceptual reasons, so that no one is tempted to use .where {...} (note the space), which breaks. As an aside: alias sc for Set-Content was, for better or worse, removed from PowerShell Core.
2

Another option (PSv3+), combining [regex]::Matches() with the -replace operator for a concise solution:

$str = @'
Host
Class
INCLUDE vmware:/?filter=Displayname Equal "server01" OR Displayname Equal "server02" OR Displayname Equal "server03 test"
'@ 

[regex]::Matches($str, '".*?"').Value -replace '"'

Regex ".*?" matches all "..."-enclosed tokens; .Value extracts them, and -replace '"' strips the " chars.
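The one-liner relies on two behaviors: .Value on a collection returns the property of every element (member-access enumeration, PSv3+), and -replace with an array on the left-hand side acts on each element. A minimal sketch of the second step in isolation, with $tokens and $clean as illustrative names:

```powershell
# Illustrative array of quoted tokens, as .Value would return them.
$tokens = '"server01"', '"server02"', '"server03 test"'

# With an array on the left-hand side, -replace is applied to each element.
# Omitting the replacement operand replaces each match with an empty string.
$clean = $tokens -replace '"'

$clean
```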

It may not be obvious, but this happens to be the fastest solution among the answers here, based on my tests - see bottom.


As an aside: The above would be even more PowerShell-idiomatic if the -match operator - which only looks for a (one) match - had a variant named, say, -matchall, so that one could write:

# WISHFUL THINKING (as of PowerShell Core 6.2)
$str -matchall '".*?"' -replace '"'

See this feature suggestion on GitHub.


Optional reading: performance comparison

Pragmatically speaking, all solutions here are helpful and may be fast enough, but there may be situations where performance must be optimized.

Generally, using Select-String (and the pipeline in general) comes with a performance penalty - while offering elegance and memory-efficient streaming processing.

Also, repeated invocation of script blocks (e.g., { $_.Value }) tends to be slow - especially in a pipeline with ForEach-Object or Where-Object, but also - to a lesser degree - with the .ForEach() and .Where() collection methods (PSv4+).
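To get a rough feel for the script-block overhead on your own machine, you could time member-access enumeration against ForEach-Object with Measure-Command. This is only a sketch: the generated input and the names $big, $all, $viaMember and $viaPipeline are made up for illustration, and the timings will vary between runs and systems.

```powershell
# Illustrative input: many quoted tokens on one line ($big is a made-up name).
$big = (1..2000 | ForEach-Object { "Equal `"server$_`"" }) -join ' OR '
$all = [regex]::Matches($big, '"[^"]+"')

# Member-access enumeration (PSv3+): no script block is invoked per element.
$viaMember = Measure-Command { $null = $all.Value }

# ForEach-Object: the script block runs once per match.
$viaPipeline = Measure-Command { $null = $all | ForEach-Object { $_.Value } }

'{0:N2} ms via .Value' -f $viaMember.TotalMilliseconds
'{0:N2} ms via ForEach-Object' -f $viaPipeline.TotalMilliseconds
```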

In the realm of regexes, you pay a performance penalty for variable-length look-behind expressions (e.g. (?<=Equal\s*")) and the use of capture groups (e.g., (.*?)).

Here is a performance comparison using the Time-Command function, averaging 1000 runs:

Time-Command -Count 1e3 { [regex]::Matches($str, '".*?"').Value -replace '"' },
   { [regex]::matches($String, '(?<=Equal\s*")[^"]+') | Foreach {$_.Value} },
   { [regex]::Matches($str, '\"(.*?)\"').Groups.Where({$_.name -eq '1'}).Value },
   { $str | Select-String -Pattern '(?<=Equal\s*")[^"]+' -AllMatches | ForEach-Object{$_.Matches.Value} } |
     Format-Table Factor, Command

Sample timings from my MacBook Pro; the exact times aren't important (you can remove the Format-Table call to see them), but the relative performance is reflected in the Factor column, from fastest to slowest.

Factor Command
------ -------
1.00   [regex]::Matches($str, '".*?"').Value -replace '"' # this answer
2.85   [regex]::Matches($str, '\"(.*?)\"').Groups.Where({$_.name -eq '1'}).Value # AdminOfThings'
6.07   [regex]::matches($String, '(?<=Equal\s*")[^"]+') | Foreach {$_.Value} # Wiktor's
8.35   $str | Select-String -Pattern '(?<=Equal\s*")[^"]+' -AllMatches | ForEach-Object{$_.Matches.Value} # LotPings'


1

An alternative: reading the file directly with Select-String, using Wiktor's good RegEx:

Select-String -Path .\file.txt -Pattern '(?<=Equal\s*")[^"]+' -AllMatches|
    ForEach-Object{$_.Matches.Value} | Set-Content NewFile.txt

Sample output:

> Get-Content .\NewFile.txt
server01
server02
server03 test
