275

I've been looking for a solution and found similar questions, only they were attempting to split sentences with spaces between them, and the answers do not work for my situation.

Currently a variable is set to a string like this:
ABCDE-123456
and I would like to split that into 2 variables, while eliminating the "-". i.e.:
var1=ABCDE
var2=123456

How is it possible to accomplish this?


This is the solution that worked for me:
var1=$(echo $STR | cut -f1 -d-)
var2=$(echo $STR | cut -f2 -d-)

Is it possible to use the cut command that will work without a delimiter (each character gets set as a variable)?

var1=$(echo $STR | cut -f1 -d?)
var2=$(echo $STR | cut -f2 -d?)
var3=$(echo $STR | cut -f3 -d?)
etc.

  • For your second question, see @mkb's comment to my answer below - that's definitely the way to go! Commented May 9, 2012 at 19:22
  • See my edited answer for one way to read individual characters into an array. Commented Jul 4, 2012 at 16:14
  • Here is the same thing in a more concise form: var1=$(cut -f1 -d- <<<$STR) Commented Dec 31, 2015 at 11:04

5 Answers

326

To split a string separated by -, you can use read with IFS:

$ IFS=- read -r var1 var2 <<< ABCDE-123456
$ echo "$var1"
ABCDE
$ echo "$var2"
123456
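A minimal sketch of two edge cases for the same read approach, assuming bash. The variable names left, right, first and rest are illustrative only. With no delimiter in the input, the second variable is left empty; with more fields than variables, the last variable receives the remainder, delimiters included.

```shell
#!/usr/bin/env bash
# No "-" in the input: everything lands in the first variable.
IFS=- read -r left right <<< "ABCDE"
echo "left=$left right=$right"

# More fields than variables: the last variable keeps the rest, "-" included.
IFS=- read -r first rest <<< "A-B-C"
echo "first=$first rest=$rest"
```

Quoting the here-string (<<< "$STR") when the input comes from a variable keeps the shell from word-splitting it before read sees it.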

Edit:

Here is how you can read each individual character into array elements:

$ read -ra foo <<<"$(echo "ABCDE-123456" | sed 's/./& /g')"

Dump the array:

$ declare -p foo
declare -a foo='([0]="A" [1]="B" [2]="C" [3]="D" [4]="E" [5]="-" [6]="1" [7]="2" [8]="3" [9]="4" [10]="5" [11]="6")'

If there are spaces in the string:

$ IFS=$'\v' read -ra foo <<<"$(echo "ABCDE 123456" | sed $'s/./&\v/g')"
$ declare -p foo
declare -a foo='([0]="A" [1]="B" [2]="C" [3]="D" [4]="E" [5]=" " [6]="1" [7]="2" [8]="3" [9]="4" [10]="5" [11]="6")'
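As an alternative sketch (not the method used above), the characters can also be collected with a plain loop and bash substring expansion, which avoids sed and the choice of a separator character. The names str and chars are illustrative.

```shell
#!/usr/bin/env bash
str="ABCDE 123456"
chars=()

# ${str:i:1} is the single character at offset i; ${#str} is the string length.
for (( i = 0; i < ${#str}; i++ )); do
    chars+=("${str:i:1}")
done

declare -p chars
```

Because each character is appended as a quoted element, spaces in the string need no special handling.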

6 Comments

this solution also has the benefit that if delimiter is not present, the var2 will be empty
The read does not work inside loops with input redirects. read will pick a wrong file descriptor to read from.
I initially upvoted this answer as an elegant solution, but I have since found that it behaves differently on Bash v3 and v4, so it doesn't work on macOS with the pre-installed Bash v3. Unfortunately I can't downvote the answer now since the vote is locked :(
A more general, correct, way: IFS=- read -r -d '' var1 var2 < <(printf %s "ABCDE-123456"). The -r -d '' and <(printf %s ...) are important
@akwky: Use an alternate file descriptor. while read -r line <&3; do ssh_or_something "$line"; done 3<file
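A runnable sketch of the alternate-file-descriptor suggestion, assuming bash. Here cat stands in for a command that swallows standard input (ssh is a typical one), and the temporary file and the lines array are illustrative. The loop reads its lines from descriptor 3, so whatever the inner command does to stdin cannot steal them.

```shell
#!/usr/bin/env bash
tmpfile=$(mktemp)
printf '%s\n' one two three > "$tmpfile"

lines=()
while IFS= read -r line <&3; do
    # Stand-in for a stdin-hungry command: it drains stdin, not descriptor 3.
    cat > /dev/null
    lines+=("$line")
done 3< "$tmpfile" <<< "unrelated stdin data"

rm -f "$tmpfile"
echo "read ${#lines[@]} lines"
```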
304

If you know it's going to be just two fields, you can skip the extra subprocesses. Like this:

STR="ABCDE-12345" 
var1=${STR%-*} # ABCDE
var2=${STR#*-} # 12345

Explanation:

  • ${STR%-*} deletes the shortest substring of $STR that matches the pattern -* (deletes - and anything after it). It starts from the end of the string.
  • ${STR#*-} deletes the shortest substring of $STR that matches the pattern *- (deletes - and anything before it). It starts from the beginning of the string.

They each have counterparts %% and ## which find the longest anchored pattern match. To memorize, use this mnemonic, shared by @DS:

"#" is to the left of "%" on a standard keyboard, so "#" removes a prefix (on the left), and "%" removes a suffix (on the right).
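A short sketch of how the four operators differ once the string contains more than one delimiter. The sample value a-b-c and the variable names are illustrative.

```shell
STR="a-b-c"

short_suffix=${STR%-*}    # removes the shortest trailing match of "-*": a-b
long_suffix=${STR%%-*}    # removes the longest trailing match of "-*":  a
short_prefix=${STR#*-}    # removes the shortest leading match of "*-":  b-c
long_prefix=${STR##*-}    # removes the longest leading match of "*-":   c

echo "$short_suffix $long_suffix $short_prefix $long_prefix"
```

For a two-field string such as ABCDE-12345 the short and long forms give the same result, which is why the single-character operators are enough in the example above.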

See the bash documentation for more information.

9 Comments

Plus 1 For knowing your POSIX shell features, avoiding expensive forks and pipes, and the absence of bashisms.
Dunno about "absence of bashisms" considering that this is already moderately cryptic .... if your delimiter is a newline instead of a hyphen, then it becomes even more cryptic. On the other hand, it works with newlines, so there's that.
I've finally found documentation for it: Shell-Parameter-Expansion
Mnemonic: "#" is to the left of "%" on a standard keyboard, so "#" removes a prefix (on the left), and "%" removes a suffix (on the right).
Another mnemonic, since your keyboard may be different (and some just "feel" the layout, rather than know it): the % symbol is typically encountered after a number, e.g. 90%, hence it is a suffix. The # symbol is typically leading comments or even just the first char in hashtags, so it's a common prefix. The purpose of both modifiers is to remove, one just removes a prefix (#), the other removes the suffix (%).
200

If your solution doesn't have to be general, i.e. only needs to work for strings like your example, you could do:

var1=$(echo $STR | cut -f1 -d-)
var2=$(echo $STR | cut -f2 -d-)

I chose cut here because you could simply extend the code for a few more variables...
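A hedged variant of the same cut approach, with the variable quoted and printf used in place of echo so that wildcards or leading dashes in the string are passed through untouched. It also shows cut -c, which selects by character position rather than by field. The names first_char and first_three are illustrative.

```shell
STR="ABCDE-123456"

# Field-based split on "-".
var1=$(printf '%s\n' "$STR" | cut -d- -f1)
var2=$(printf '%s\n' "$STR" | cut -d- -f2)

# Character-based selection: no delimiter involved.
first_char=$(printf '%s\n' "$STR" | cut -c1)
first_three=$(printf '%s\n' "$STR" | cut -c1-3)

echo "$var1 $var2 $first_char $first_three"
```

One thing to keep in mind: when a line contains no delimiter at all, cut prints the whole line for any requested field unless -s is given.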

5 Comments

Can you look at my post again and see if you have a solution for the followup question? thanks!
You can use cut to cut characters too! cut -c1 for example.
Although this is very simple to read and write, it is a slow solution because it forces you to read the same data ($STR) twice. If you care about your script's performance, @anubhava's solution is much better.
Apart from being an ugly last-resort solution, this has a bug: You should absolutely use double quotes in echo "$STR" unless you specifically want the shell to expand any wildcards in the string as a side effect. See also stackoverflow.com/questions/10067266/…
You're right about double quotes of course, though I did point out this solution wasn't general. However I think your assessment is a bit unfair - for some people this solution may be more readable (and hence extensible etc.) than some others, and it doesn't rely entirely on arcane bash features that wouldn't translate to other shells. I suspect that's why my solution, though less elegant, continues to get votes periodically...
58

Sounds like a job for set with a custom IFS.

IFS=-
set -- $STR   # "--" keeps an empty $STR from making set list every variable
var1=$1
var2=$2

(You will want to do this in a function with a local IFS so you don't mess up other parts of your script where you require IFS to be what you expect.)
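A minimal sketch of that advice, assuming bash (local is not POSIX). The function name split_on_dash and the output variables part1 and part2 are hypothetical. IFS is changed only inside the function, and globbing is switched off around the split so a string such as *-x is not expanded into file names.

```shell
#!/usr/bin/env bash
split_on_dash() {
    local IFS=-      # the custom IFS is confined to this function
    set -f           # disable pathname expansion during the unquoted split
    set -- $1        # split the argument into positional parameters
    set +f           # re-enable globbing (assumes it was enabled on entry)
    part1=$1
    part2=$2
}

split_on_dash "ABCDE-123456"
echo "$part1 $part2"
```

The function writes its results to global variables for brevity; a caller that already runs with set -f would need to restore that state itself.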

9 Comments

Nice - I knew about $IFS but hadn't seen how it could be used.
I used triplee's example and it worked exactly as advertised! Just change the last two lines to myvar1=$(echo $1) && myvar2=$(echo $2) if you need to store them throughout a script with several "thrown" variables.
This is a really sweet solution if we need to write something that is not Bash specific. To handle IFS troubles, one can add OLDIFS=$IFS at the beginning before overwriting it, and then add IFS=$OLDIFS just after the set line.
Maybe add a set -f to disable pathname expansion before set -- $STR, or it will expand to matching file names if $STR contains glob patterns.
41

Using bash regex capabilities:

re="^([^-]+)-(.*)$"
[[ "ABCDE-123456" =~ $re ]] && var1="${BASH_REMATCH[1]}" && var2="${BASH_REMATCH[2]}"
echo "$var1"
echo "$var2"

OUTPUT

ABCDE
123456
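A sketch of the same regex with the no-match case handled explicitly, assuming bash. The input variable str and the error message are illustrative. Keeping the pattern in a variable and leaving $re unquoted on the right-hand side of =~ makes bash treat it as a regular expression rather than a literal string.

```shell
#!/usr/bin/env bash
re='^([^-]+)-(.*)$'
str="ABCDE-123456"

if [[ $str =~ $re ]]; then
    var1=${BASH_REMATCH[1]}   # text before the first "-"
    var2=${BASH_REMATCH[2]}   # everything after the first "-"
else
    echo "no '-' delimiter in: $str" >&2
fi

echo "$var1 $var2"
```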

1 Comment

Love pre-defining the re for later use(s)!
