
I have a huge CSV of phishing links, and I'm trying to trim off the leading https://www. and anything after the .com, .edu, etc. The ideal output of the PowerShell script would be a long list of host names that look something like google.com or microsoft.com. So far I have imported the CSV, but everything I have tried either doesn't work or leaves the www at the beginning. Any help would be great. The CSV I'm using is this: http://data.phishtank.com/data/online-valid.csv

$urls = Import-Csv -Path .\online-valid.csv | select -ExpandProperty "url"
Run this: [URI]'http://www.phishtank.com/phish_detail.php?phish_id=6429209' and you're halfway there. ;-) Commented Mar 3, 2020 at 2:44
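To see what the comment is hinting at, here is a minimal sketch of casting that URL string to [uri] (System.Uri) and reading a few of the parsed properties. The variable name $u is only for illustration.

```powershell
# Cast a URL string to System.Uri; the string is parsed into its components.
$u = [uri]'http://www.phishtank.com/phish_detail.php?phish_id=6429209'

$u.Scheme        # http
$u.Host          # www.phishtank.com
$u.AbsolutePath  # /phish_detail.php
$u.Query         # ?phish_id=6429209
```

The .Host property already drops the scheme, path, and query string, so the only part left to remove is the www. prefix.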

2 Answers


The code below reads your CSV, adds a Host column containing each URL's host name with any leading www. removed, and writes the result to a new CSV. Have a play around with [Uri]; it is very useful when parsing web links.

$csv = import-csv C:\temp\verified_online.csv

Foreach($Site in $csv) {
    $site | Add-Member -MemberType NoteProperty -Name "Host" -Value $(([Uri]$Site.url).Host -replace '^www\.')
}

$csv | Export-Csv C:\temp\verified_online2.csv -NoTypeInformation

Adjusted based on recommendation from Mklement0.
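As a possible variation on the same idea, the extra column could also be produced with a calculated property in Select-Object rather than Add-Member. This is only a sketch: the two sample rows are made up so the snippet runs on its own, and with the real file you would use Import-Csv in place of the here-string.

```powershell
# Made-up sample rows standing in for the real CSV (assumption: it has a 'url' column).
$rows = @'
phish_id,url
1,https://www.example.com/login?x=1
2,http://example.org/a/b
'@ | ConvertFrom-Csv

# Keep every existing column and append a calculated 'Host' column.
$withHost = $rows | Select-Object *, @{
    Name       = 'Host'
    Expression = { ([uri] $_.url).Host -replace '^www\.' }
}

$withHost | Format-Table phish_id, url, Host
```

Unlike Add-Member, this leaves the original objects untouched and returns new ones, which may or may not matter for your use.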


A concise and fast alternative to Drew's helpful answer based on casting the URL strings directly to an array of [uri] (System.Uri) instances, and then trimming prefix www., if present, from their .Host (server name) property:

([uri[]] (Import-Csv .\online-valid.csv).url).Host -replace '^www\.'

Note that the -replace operator is regex-based, and the regex ^www\. makes sure that www is only matched at the start (^) of the string, and only if followed by a literal . (\.), in which case this prefix is removed (replaced with the implied empty string); if no such prefix is present, the input string is passed through as-is.
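A quick way to check that behavior is to run the operator over a few hand-picked strings. The sample host names below are invented for illustration.

```powershell
# -replace applied to an array returns one result per element.
$samples = 'www.google.com', 'google.com', 'mail.www.example.com', 'wwwexample.com'
$trimmed = $samples -replace '^www\.'

$trimmed
# google.com             (prefix removed)
# google.com             (no prefix, passed through)
# mail.www.example.com   (www. not at the start, left alone)
# wwwexample.com         (no literal dot after www, left alone)
```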

The solution reads the entire CSV file into memory at once, for convenience and speed, and outputs just the trimmed server names, as an array of strings.
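If holding the whole file in memory is a concern, the same transformation can be written as a pipeline that handles one row at a time. The sketch below assumes a small stand-in file (the path and sample rows are hypothetical) so that it runs on its own; point $path at online-valid.csv to use it on the real data.

```powershell
# Assumption: a tiny stand-in CSV with the same 'url' column, written to the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-urls.csv'
@'
phish_id,url
1,https://www.example.com/login?x=1
2,http://example.org/a/b
'@ | Set-Content -Path $path

# Import-Csv emits rows one by one; each is converted as it passes through.
$hosts = Import-Csv -Path $path |
    ForEach-Object { ([uri] $_.url).Host -replace '^www\.' }

$hosts
```

Piping the result to Set-Content instead of assigning it to $hosts would avoid collecting the output as well, at the cost of some speed compared with the one-liner above.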
