I'm stuck on this. I want to get all the URLs in the text that match my pattern. Each result should include the URL's first parameter, but not the second one.
Two issues:
- It's not getting the first URL
- I'm missing how the capture works.
In Method 1, I see the matches, but I don't see the captured text for what I put in parentheses. In Method 2, I see my captures in some of the output, but I also get extra output that contains more than my capture. I prefer the Method 2 style. I wrote Method 1 to try to understand what's happening, but just dug myself a deeper hole.
$fileContents = 'Misc Text < a href="http://example.com/Test.aspx?u=a1">blah blah</a> More Stuff <a href="http://example.com/Test.aspx?u=b2&parm=123">blah blah </a> Closing Text'
#Sample URL http://example.com/Test.aspx?u=a1&parm=123
$pattern = '<a href="(http://example.com/Test.aspx\?u=.*?)[&"]'
Write-Host "RegEx Pattern=$pattern"
Write-Host "----------- Method 1 --------------"
$groups = [regex]::Matches($fileContents, $pattern)
$groupnum = 0
foreach ($group in $groups)
{
    Write-Host "Group=$groupnum URL=$group "
    $capturenum = 0
    foreach ($capture in $group.Captures)
    {
        Write-Host "Group=$groupnum Capture=$capturenum URL=$capture.value index=$($capture.index)"
        $capturenum = $capturenum + 1
    }
    $groupnum = $groupnum + 1
}
Write-Host "----------- Method 2 --------------"
$urls = [regex]::Matches($fileContents, $pattern).Groups.Captures.Value
#$urls = $urls | select -Unique
Write-Host "Number of Matches = $($urls.Count)"
foreach ($url in $urls)
{
    Write-Host "URL: $url "
}
Write-Host " "
Output:
----------- Method 1 --------------
Group=0 URL=<a href="http://example.com/Test.aspx?u=b2&
Group=0 Capture=0 URL=<a href="http://example.com/Test.aspx?u=b2&.value index=81
----------- Method 2 --------------
Number of Matches = 2
URL: <a href="http://example.com/Test.aspx?u=b2&
URL: http://example.com/Test.aspx?u=b2
PowerShell version 5.1.17763.592
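On the capture question: [regex]::Matches returns one Match object per hit. On each Match, Groups[0] is the whole matched text and Groups[1] is the first parenthesized group. Chaining .Groups.Captures.Value across the whole collection flattens every group of every match, which is why the full match shows up next to the captured URL. Also note that the sample string as posted has a space in the first anchor (`< a href`), which the literal `<a href="` in the pattern does not allow for. Below is a minimal sketch, assuming only the group 1 text is wanted. The loosened pattern (`\s*` after `<`, and `[^&"]*` in place of `.*?`) is my own variation, not the pattern from the post.

```powershell
# Sample text exactly as posted, including the space in the first "< a href".
$fileContents = 'Misc Text < a href="http://example.com/Test.aspx?u=a1">blah blah</a> More Stuff <a href="http://example.com/Test.aspx?u=b2&parm=123">blah blah </a> Closing Text'

# Assumed variation of the pattern: tolerate whitespace after "<" and stop the
# capture at the first "&" or closing quote, so only the first parameter is kept.
$pattern = '<\s*a\s+href="(http://example\.com/Test\.aspx\?u=[^&"]*)'

# Groups[0] is the whole match; Groups[1] is the first parenthesized capture.
$urls = @(foreach ($match in [regex]::Matches($fileContents, $pattern)) {
    $match.Groups[1].Value
})

Write-Host "Number of Matches = $($urls.Count)"
foreach ($url in $urls)
{
    Write-Host "URL: $url"
}
# Number of Matches = 2
# URL: http://example.com/Test.aspx?u=a1
# URL: http://example.com/Test.aspx?u=b2
```

Indexing Groups[1] per match keeps the Method 2 feel (one short expression) without mixing whole matches into the result.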
Select-String -Pattern '(?<=a href=")[^"]*' -AllMatches
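A sketch of how that Select-String call might be wired up against the sample text. The lookbehind `(?<=a href=")` means the match itself starts at the URL, so no capture group is needed. One assumption on my part: since the second parameter should be excluded, I added `&` to the negated character class, which is not part of the pattern above.

```powershell
# Sample text exactly as posted in the question.
$fileContents = 'Misc Text < a href="http://example.com/Test.aspx?u=a1">blah blah</a> More Stuff <a href="http://example.com/Test.aspx?u=b2&parm=123">blah blah </a> Closing Text'

# -AllMatches makes Select-String report every hit in the string, not just the first.
# Each MatchInfo exposes the hits through its Matches property.
$urls = @($fileContents |
    Select-String -Pattern '(?<=a href=")[^"&]*' -AllMatches |
    ForEach-Object { $_.Matches.Value })

foreach ($url in $urls)
{
    Write-Host "URL: $url"
}
# URL: http://example.com/Test.aspx?u=a1
# URL: http://example.com/Test.aspx?u=b2
```

Unlike the pattern in the question, this one does not check that the URL points at http://example.com/Test.aspx, so it returns the href of every anchor in the text.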