Given the following string:
'<p><a href="china">China</a><br><a href="india">India</a><br><a
href="korea">Korea</a><br><a href="malaysia">Malaysia</a><br><a
href="thailand">Thailand</a></p>'
I'd like to use PowerShell to extract all of the countries listed in it. In other words, I want to return @('China', 'India', 'Korea', 'Malaysia', 'Thailand').
I have tried using a regex but can't find the right pattern. For example:
'<p><a href="china">China</a><br><a href="india">India</a><br><a href="korea">Korea</a><br><a href="malaysia">Malaysia</a><br><a href="thailand">Thailand</a></p>' -match '(<a href="[A-Z a-z]*">[A-Z a-z]*</a>)+'
$matches
Which returns:
Name                           Value
----                           -----
1                              <a href="china">China</a>
0                              <a href="china">China</a>
Any suggestions? Is regex the right approach here?
P.S. Note that the snippet is not well-formed XML (the <br> tags are never closed), so I can't simply convert it to XML.
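
One direction that may be worth checking: as far as I can tell, -match stops at the first match, whereas the .NET [regex]::Matches method returns every match. Below is a minimal sketch of that idea, assuming the link text never contains a '<' character. The variable names $html and $countries are only illustrative, and the pattern is a guess at what fits this particular snippet, not a general HTML parser.

```powershell
# Illustrative variable name; the string is the one from the question.
$html = '<p><a href="china">China</a><br><a href="india">India</a><br><a href="korea">Korea</a><br><a href="malaysia">Malaysia</a><br><a href="thailand">Thailand</a></p>'

# [regex]::Matches returns all matches, not just the first one.
# Capture group 1 holds the text between <a ...> and </a>.
# \s+ tolerates a space or a line break between "<a" and "href".
$pattern = '<a\s+href="[^"]*">([^<]*)</a>'

$countries = @([regex]::Matches($html, $pattern) |
    ForEach-Object { $_.Groups[1].Value })

$countries
```

With the string above this should list the five link texts, China through Thailand, as an array of strings. Whether a regex like this is robust enough for other, messier HTML is still an open question.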