I want to split each line of a pipe on spaces, and then print each token on its own line.

I realise that I can get this result using:

(cat someFileInsteadOfAPipe).split(" ")

But I want more flexibility. I want to be able to do just about anything with each token. (I used to use AWK on Unix, and I'm trying to get the same functionality.)

I currently have:

echo "Once upon a time there were three little pigs" | %{$data = $_.split(" "); Write-Output "$($data[0]) and whatever I want to output with it"}

Which, obviously, only prints the first token. Is there a way for me to for-each over the tokens, printing each in turn?

Also, the %{$data = $_.split(" "); Write-Output "$($data[0])"} part I got from a blog, and I really don't understand what I'm doing or how the syntax works.

I want to google for it, but I don't know what to call it. Please help me out with a word or two to Google, or a link explaining to me what the % and all the $ symbols do, as well as the significance of the opening and closing brackets.

I realise I can't actually use (cat someFileInsteadOfAPipe).split(" "), since the file (or preferably the incoming pipe) contains more than one line.

Regarding some of the answers:

If you are using Select-String to filter the output before tokenizing, you need to keep in mind that the output of the Select-String command is not a collection of strings, but a collection of MatchInfo objects. To get to the string you want to split, you need to access the Line property of the MatchInfo object, like so:

cat someFile | Select-String "keywordFoo" | %{$_.Line.Split(" ")}

4 Answers

154
"Once upon a time there were three little pigs".Split(" ") | ForEach {
    "$_ is a token"
}

The key is $_, which stands for the current object in the pipeline.

About the code you found online:

% is an alias for ForEach-Object. Anything enclosed in the braces ({ }) is a script block that runs once for each object it receives. In this case, it runs only once, because you're sending it a single string.

$_.Split(" ") takes the current object and splits it on spaces. The current object is whatever ForEach-Object is currently looping over.
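
Putting these pieces together for multi-line input, a minimal sketch might look like the following. The sample lines and the variable names ($lines, $tokens, $result) are made up for illustration, and the unary -split operator is swapped in for .Split(" ") so that runs of spaces do not produce empty tokens.

```powershell
# Hypothetical sample input; in practice this could come from Get-Content or any pipeline.
$lines = 'Once upon a time', 'there were three little pigs'

$result = $lines | ForEach-Object {
    # $_ is the current line; unary -split breaks it on runs of whitespace.
    $tokens = -split $_
    foreach ($token in $tokens) {
        # Any per-token logic can go here, much like an awk action block.
        "$token is a token"
    }
}

# One output string per token, across all lines.
$result
```

The inner foreach statement is used here only to keep the per-token step visibly separate from the per-line step; a second ForEach-Object stage in the pipeline would work as well.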


3 Comments

aaaah, thanks for the edit. Knowing that % is short for foreach-object means I can do this for multiple lines: cat .\tmp.txt | %{$_.Split(" ")} | %{Write-Output "$($_) hello"} Problem solved.
Perfect! Glad I could help. The last part of your command could actually just be "$_ hello". You only need the $(...) notation if you're trying to expand something more than a plain variable inside a string, such as an object's property. For example: "My last name is $($person.surname)." Or the result of a method call: "Tomorrow's date is $((Get-Date).AddDays(1))".
Just a note: As of PowerShell v2 there is a -split operator which can be used to split on whitespace in general (-split $foo) or analogous to .Split(' '): $foo -split ' '.
5

-split outputs an array, and you can save it to a variable like this:

$a = -split 'Once  upon    a     time'
$a[0]

Once

Another cute thing, you can have arrays on both sides of an assignment statement:

$a,$b,$c = -split 'Once  upon    a'
$c

a
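
A related detail, sketched here with made-up variable names ($first, $rest, $x, $y, $z): when there are fewer variables than tokens, the last variable receives all remaining tokens as an array, and when there are more variables than tokens, the extra variables are left as $null.

```powershell
# Two variables, four tokens: the last variable collects the remainder.
$first, $rest = -split 'Once upon a time'

$first            # the first token
$rest.Count       # how many tokens were left over
$rest -join ','   # the leftover tokens, joined for display

# Three variables, two tokens: the extra variable is $null.
$x, $y, $z = -split 'Once upon'
$null -eq $z
```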


5

To complement Justus Grunow's helpful answer:

  • As Joey notes in a comment, PowerShell has a powerful, regex-based -split operator.

    • In its unary form (-split '...'), -split behaves like awk's default field splitting, which means that:
      • Leading and trailing whitespace is ignored.
      • Any run of whitespace (e.g., multiple adjacent spaces) is treated as a single separator.
  • In PowerShell v4+, an expression-based (and therefore faster) alternative to the ForEach-Object cmdlet became available: the intrinsic .ForEach() method. It arrived alongside the .Where() method, a more powerful, expression-based alternative to Where-Object.

Here's a solution based on these features:

PS> (-split '   One      for the money   ').ForEach({ "token: [$_]" })
token: [One]
token: [for]
token: [the]
token: [money]

Note that the leading and trailing whitespace was ignored, and that the multiple spaces between One and for were treated as a single separator.
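
The .Where() method mentioned above can be chained in the same way. The following is only a sketch: the length filter and the variable names ($tokens, $long) are arbitrary choices for illustration, and PowerShell v4+ is assumed.

```powershell
# Split on runs of whitespace, ignoring leading and trailing whitespace.
$tokens = -split '   One      for the money   '

# Keep only the tokens longer than three characters (an arbitrary example filter).
$long = $tokens.Where({ $_.Length -gt 3 })

# Format whatever survived the filter.
$long.ForEach({ "long token: [$_]" })
```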


2

Another way to accomplish this is a combination of Justus Thane's and mklement0's answers. It doesn't make much sense to do it this way for a one-liner, but it comes in pretty handy when you're trying to mass-edit a file or a bunch of filenames:

$test = '   One      for the money   '
# RemoveEmptyEntries drops the empty strings produced by consecutive spaces.
$option = [System.StringSplitOptions]::RemoveEmptyEntries
$test.Split(' ', $option).ForEach({ $_ })

This will come out as:

One
for
the
money
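
To see why the option matters, here is a small comparison sketch using the same sample string. The variable names ($plain, $clean) are made up for illustration.

```powershell
$test = '   One      for the money   '

# Plain .Split(' ') keeps an empty string for every extra space.
$plain = $test.Split(' ')

# RemoveEmptyEntries drops those empty strings, leaving only the real tokens.
$clean = $test.Split(' ', [System.StringSplitOptions]::RemoveEmptyEntries)

"plain: $($plain.Count) entries, clean: $($clean.Count) entries"
```

Without the option, the result contains many more entries than there are words, which is easy to miss when the empty entries are printed as blank lines.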

1 Comment

I keep finding that I get the wrong number when using a plain text file with one line containing one computer name (hostname) and one blank line. $counterTotal = $($computers.Split(" ").count) gives me exactly what I want. Thanks for the inspiration @s31064
