
I am trying to write a script that downloads information from web sites. I can download the information, but I cannot get the filtering to work. I have a series of values that I want skipped stored in $TakeOut, but the comparison if ... -eq $TakeOut does not recognize them, so I have to write a line for each value.

What I am wondering is whether there is a way to compare against a variable such as $TakeOut, because over time there will be a considerable number of values to skip.

This works but is not practical in the long run.

if ($R.innerText -eq "Home") {Continue}

Something like this would be preferable.

if ($R.innerText -eq $TakeOut) {Continue}

Here is a sample of my code.

#List of values to skip
$TakeOut = @()
$TakeOut = (
"Help",
"Home",
"News",
"Sports",
"Terms of use",
"Travel",
"Video",
"Weather"
)

#Retrieve website information
$Results = ((Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links)

#Filter and format to new table of values
$objects = @()
foreach($R in $Results) {
   if ($R.innerText -eq $TakeOut) {Continue}
   $objects += New-Object -Type PSObject -Prop @{'InnerText'= $R.InnerText;'href'=$R.href;'Title'=$R.href.split('/')[4]}
}

#output to file
$objects  | ConvertTo-HTML -As Table -Fragment | Out-String >> $list_F

1 Answer


You cannot meaningfully use an array as the RHS of an -eq operation: with a scalar LHS, the array is implicitly stringified (its elements joined with spaces), so the comparison won't work as intended.

PowerShell has operators -contains and -in to test membership of a value in an array (using -eq on a per-element basis - see this answer for background); therefore:

 if ($R.innerText -in $TakeOut) {Continue}
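
To make the difference concrete, here is a minimal sketch that compares the three operators. The shortened $TakeOut list and the $text value are made-up sample data, not taken from the web request in the question.

```powershell
# Hypothetical sample data standing in for the question's skip list.
$TakeOut = 'Help', 'Home', 'News'
$text    = 'Home'

# Scalar LHS, array RHS: the array is stringified to 'Help Home News',
# so this compares 'Home' with that single string.
$eqResult = $text -eq $TakeOut

# Membership tests: -in and -contains are equivalent, with the operands swapped.
$inResult       = $text -in $TakeOut
$containsResult = $TakeOut -contains $text

# Like -eq, -in is case-insensitive by default; -cin is the case-sensitive variant.
$lowerIn  = 'home' -in  $TakeOut
$lowerCin = 'home' -cin $TakeOut

"{0} {1} {2} {3} {4}" -f $eqResult, $inResult, $containsResult, $lowerIn, $lowerCin
```

Here the -eq comparison yields $false, both membership tests yield $true, and the lowercase value matches with -in but not with -cin. Note that -in requires PSv3+; on PSv2 only -contains is available.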

Generally, your code can be streamlined (PSv3+ syntax):

$TakeOut = 
    "Help",
    "Home",
    "News",
    "Sports",
    "Terms of use",
    "Travel",
    "Video",
    "Weather"

#Retrieve website information
$Results = (Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links

#Filter and format to new table of values
$objects = foreach($R in $Results) {
   if ($R.innerText -in $TakeOut) {Continue}
   [pscustomobject] @{
      InnerText = $R.InnerText
      href = $R.href
      Title = $R.href.split('/')[4]
   }
}

#output to file
$objects | ConvertTo-HTML -As Table -Fragment >> $list_F
  • Note the absence of @(...), which is never needed for array literals.

  • Building an array in a loop with += is slow (and verbose); simply use the foreach statement as an expression, which returns the loop body's outputs as an array.

  • [pscustomobject] @{ ... } is PSv3+ syntactic sugar for constructing custom objects; in addition to being faster than a New-Object call, it has the added advantage of preserving property order.
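
The last two points can be tried in isolation with a small sketch. The $rows values and the property names Name and Upper are invented for illustration; a real run would iterate over the .Links collection instead.

```powershell
# Hypothetical input; stands in for the links returned by Invoke-WebRequest.
$rows = 'a', 'b', 'c'

# foreach used as an expression: whatever the loop body outputs is collected,
# and two or more outputs are returned as an array.
$collected = foreach ($r in $rows) {
    if ($r -eq 'b') { continue }   # skipped items contribute nothing
    [pscustomobject] @{
        Name  = $r
        Upper = $r.ToUpper()
    }
}

# Two objects were collected ('a' and 'c').
$count = $collected.Count

# [pscustomobject] keeps the properties in the order they were written.
$propertyOrder = $collected[0].psobject.Properties.Name -join ','

"$count $propertyOrder"
```

One caveat with this pattern: if the loop happens to output exactly one object, the variable holds that object rather than a one-element array, and with no output it is $null. Wrapping the statement as @(foreach (...) { ... }) is a common way to guarantee an array when that matters.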

You could write the whole thing as a single pipeline:

#Retrieve website information
(Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links | ForEach-Object {
   #Filter and format to new table of values
   if ($_.innerText -in $TakeOut) {return}
   [pscustomobject] @{
      InnerText = $_.InnerText
      href = $_.href
      Title = $_.href.split('/')[4]
   }
} | ConvertTo-HTML -As Table -Fragment >> $list_F

Note the need to use return instead of continue to move on to the next input.
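
The reason is that ForEach-Object invokes a script block once per input object rather than running a loop, so return only ends the current invocation. A minimal sketch with made-up values ($skip and the input strings are assumptions for illustration):

```powershell
# Hypothetical skip list and input values.
$skip = 'b', 'x'

$kept = 'a', 'b', 'c' | ForEach-Object {
    # return exits only this invocation of the script block;
    # the pipeline then moves on to the next input object.
    if ($_ -in $skip) { return }
    $_.ToUpper()
}

$kept -join ','
```

This leaves 'A' and 'C' in $kept. Using continue here would not behave like a loop's continue: since there is no enclosing loop inside the script block, it would look for one further up the call stack and can end the whole pipeline, or the script, instead of skipping one item.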


1 Comment

Glad to hear it, @Woody.
