Extract substring using regexp in plain bash

Question

I'm trying to extract the time from a string using bash, and I'm having a hard time figuring it out.

My string is like this:

US/Central - 10:26 PM (CST)

And I want to extract the 10:26 part.

Anybody knows of a way of doing this only with bash - without using sed, awk, etc?

Like, in PHP I would use - not the best way, but it works - something like:

preg_match( "/(\d{2}\:\d{2}) PM \(CST\)/", "US/Central - 10:26 PM (CST)", $matches );

Thanks for any help, even if the answer uses sed or awk

Gilles Quénot · Accepted Answer · 2024-12-13 17:55:46Z

314

Using pure bash :

$ cat file.txt
US/Central - 10:26 PM (CST)
$ while read a b time x; do [[ $b == - ]] && echo $time; done < file.txt

with bash regex :

$ [[ "US/Central - 10:26 PM (CST)" =~ -[[:space:]]*([0-9]{2}:[0-9]{2}) ]] &&
    echo ${BASH_REMATCH[1]}

using grep and look-around advanced regex :

$ echo "US/Central - 10:26 PM (CST)" | grep -oP "\-\s+\K\d{2}:\d{2}"

using sed :

$ echo "US/Central - 10:26 PM (CST)" |
    sed 's/.*\- *\([0-9]\{2\}:[0-9]\{2\}\).*/\1/'

using Perl :

$ echo "US/Central - 10:26 PM (CST)" |
    perl -lne 'print $& if /\-\s+\K\d{2}:\d{2}/'

and last one using awk :

$ echo "US/Central - 10:26 PM (CST)" |
    awk '{for (i=1; i<NF; i++){if ($i == "-"){print $(i+1);exit}}}'
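
For anyone who wants the bash regex variant above in reusable form, here is a minimal sketch. The function name extract_time and the variable re are my own illustrative choices, not part of the answer. Keeping the pattern in a variable is a common way to avoid quoting surprises inside [[ ... =~ ... ]], and the return code lets the caller tell "no match" apart from an empty result.

```shell
#!/usr/bin/env bash
# extract_time is a hypothetical helper name, not from the original answer.
extract_time() {
    # Hyphen, optional whitespace, then HH:MM captured in group 1.
    local re='-[[:space:]]*([0-9]{2}:[0-9]{2})'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1    # no time found after a hyphen
    fi
}

extracted=$(extract_time "US/Central - 10:26 PM (CST)")
echo "$extracted"
```

Usage would look like: if t=$(extract_time "$line"); then ...; fi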

edited Dec 13, 2024 at 17:55

answered Nov 14, 2012 at 4:54

Gilles Quénot


4 Comments

andrux Over a year ago

Cool! Any chance I could also use the hyphen "-" in the pattern? That grep returns several matches, and I'm only interested in the one that has the hyphen, then a space, then the time.

andrux Over a year ago

I could've probably got the perl solution, but it's an excellent plus. Thanks!

Marco Sulla Over a year ago

Thank you for letting me know about the \K "trick". grep with Perl syntax is really powerful.

CodeBrew Over a year ago

I like the sed version but wanted to warn others that sed doesn't necessarily support the + quantifier, since it is not part of POSIX basic regular expressions. One workaround is the \{1,\} interval, which matches one or more.
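
To illustrate the quantifier point in the comment above, a small sketch with the same input. It assumes a sed that accepts -E for extended regular expressions (GNU sed and the BSD sed on macOS both do, as far as I know); the variable names bre and ere are just for the example.

```shell
input="US/Central - 10:26 PM (CST)"

# POSIX basic regex: \{1,\} means "one or more" where + is unavailable.
bre=$(printf '%s\n' "$input" | sed 's/.*- \{1,\}\([0-9]\{2\}:[0-9]\{2\}\).*/\1/')

# Extended regex via -E: + works, and groups/intervals need no backslashes.
ere=$(printf '%s\n' "$input" | sed -E 's/.*- +([0-9]{2}:[0-9]{2}).*/\1/')

echo "$bre"
echo "$ere"
```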

jgshawkey · Accepted Answer · 2016-04-29 17:19:22Z

167

    echo "US/Central - 10:26 PM (CST)" | sed -n "s/^.*-\s*\(\S*\).*$/\1/p"

-n      suppress printing
s       substitute
^.*     anything at the beginning
-       up until the dash
\s*     any space characters (any whitespace character)
\(      start capture group
\S*     any non-space characters
\)      end capture group
.*$     anything at the end
\1      substitute 1st capture group for everything on line
p       print it
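
Note that \s and \S are GNU extensions rather than POSIX syntax, so a sed without them may treat them as a literal s and S. A sketch of the same command with POSIX character classes, which should behave the same on GNU sed and is likely to be more portable (I have only reasoned this through for the sample string; the variable name result is illustrative):

```shell
# [[:space:]] and [^[:space:]] stand in for the GNU-only \s and \S.
result=$(echo "US/Central - 10:26 PM (CST)" |
    sed -n 's/^.*-[[:space:]]*\([^[:space:]]*\).*$/\1/p')
echo "$result"
```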

edited Apr 29, 2016 at 17:19

answered Apr 22, 2016 at 16:14

jgshawkey


8 Comments

Noumenon Over a year ago

I feel like this made me an instant sed master. One good option I can tweak is better than nine I don't understand.

studgeek Over a year ago

Thanks for the detailed explanation, helps to avoid future "how do I regexp XXXX" posts.

Victor Zamanian Over a year ago

Could you explain why you first suppress printing with -n then request printing again with /p? Wouldn't it be the same to omit the -n flag and omit the /p directive? Thanks.

tdashroy Over a year ago

@VictorZamanian from here: "By default, sed prints every line. If it makes a substitution, the new text is printed instead of the old one. If you use an optional argument to sed, "sed -n," it will not, by default, print any new lines. ... When the "-n" option is used, the "p" flag will cause the modified line to be printed."
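
A quick way to see the difference described above is to feed sed one line that matches and one that doesn't. This is only a sketch; the two-line input and the variable names are made up for the demonstration, and the pattern uses POSIX character classes so it does not depend on GNU extensions.

```shell
lines=$'no time here\nUS/Central - 10:26 PM (CST)'
script='s/^.*-[[:space:]]*\([^[:space:]]*\).*$/\1/'

# Without -n: every line is printed, substituted or not.
without_n=$(printf '%s\n' "$lines" | sed "$script")

# With -n and the p flag: only lines where the substitution happened.
with_n=$(printf '%s\n' "$lines" | sed -n "${script}p")

printf 'without -n:\n%s\n' "$without_n"
printf 'with -n and p:\n%s\n' "$with_n"
```

So omitting both -n and p only gives the same output when every input line matches.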

Finesse Over a year ago

It outputs an empty line on macOS both in bash and zsh


doubleDown · Accepted Answer · 2012-11-14 08:01:57Z

41

Quick 'n dirty, regex-free, low-robustness chop-chop technique

string="US/Central - 10:26 PM (CST)"
etime="${string% [AP]M*}"
etime="${etime#* - }"
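
For readers unfamiliar with the two expansions used above, here is the same snippet annotated step by step, with the intermediate value printed. Nothing new is assumed beyond the sample string from the question.

```shell
string="US/Central - 10:26 PM (CST)"

# ${var%pattern}: strip the shortest suffix matching " [AP]M*",
# i.e. everything from " AM" or " PM" to the end.
etime="${string% [AP]M*}"
echo "$etime"    # US/Central - 10:26

# ${var#pattern}: strip the shortest prefix matching "* - ",
# i.e. everything up to and including the first " - ".
etime="${etime#* - }"
echo "$etime"    # 10:26
```

The patterns here are shell globs, not regular expressions, which is why this runs without any regex engine.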

answered Nov 14, 2012 at 8:01

doubleDown


3 Comments

Orwellophile Over a year ago

That is so disgustingly dirty that I'm ashamed I didn't think of it myself. +1 | read zone dash time apm zone works too

Victor Zamanian Over a year ago

Very clean, and avoids calls to external programs.

Pedro Over a year ago

Hi, this would be 10x more useful if it included a reference to further documentation or some names around the technique so that people could go off and research more. For the interested, this is bash string manipulation, and you can find more details here: tldp.org/LDP/abs/html/string-manipulation.html

LarsTech · Accepted Answer · 2019-03-26 19:07:38Z

6

If your string is

foo="US/Central - 10:26 PM (CST)"

then

echo "${foo}" | cut -d ' ' -f3

will do the job.
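
If spawning cut is undesirable, the same field-based idea can be sketched in bash alone with read -a. The array name fields and the variable third are illustrative. One difference to keep in mind: cut treats every single space as a delimiter, while read collapses runs of whitespace.

```shell
foo="US/Central - 10:26 PM (CST)"

# Split on whitespace into an array; indices are zero-based,
# so cut's field 3 is fields[2].
read -r -a fields <<< "$foo"
third="${fields[2]}"
echo "$third"
```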

edited Mar 26, 2019 at 19:07

LarsTech


answered Mar 26, 2019 at 19:06

LeChatDeNansen


3 Comments

Markus Over a year ago

Or cut -c14-18, of course only as long as the character position doesn't change, which shouldn't happen if the timezone is fixed.

Aurovrata Over a year ago

doesn't answer the question which specifically asks for a regex based solution

LeChatDeNansen Over a year ago

@Aurovrata : Yes, you're right. So I would suggest : tim=$(print -- "${foo}" | grep -Eo "[[:digit:]]+:[[:digit:]]+") ; # assuming every record has the same format ... but this is not BASH but ksh

Eric Aya · Accepted Answer · 2023-01-19 10:00:44Z

2

No need to open a pipe and spawn sed or awk to extract the 10:26 (time) part. Bash can easily handle this.

input="US/Central - 10:26 PM (CST)"
[[ $input =~ ([0-9]+:[0-9]+) ]]
echo ${BASH_REMATCH[1]}

Outputs:

10:26
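
The snippet above prints whatever is left in BASH_REMATCH even when nothing matched. A slightly more defensive sketch, assuming you also want hours and minutes separately, checks the result of [[ ... ]] and uses two capture groups. The variable names re, hours and minutes are my own.

```shell
input="US/Central - 10:26 PM (CST)"
re='([0-9]+):([0-9]+)'

if [[ $input =~ $re ]]; then
    # Index 0 is the whole match; 1 and 2 are the capture groups.
    hours="${BASH_REMATCH[1]}"
    minutes="${BASH_REMATCH[2]}"
    echo "hours=$hours minutes=$minutes"
else
    echo "no time found" >&2
fi
```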

If you're using zsh, it's the same, except the match result will be in $match[1] instead of $BASH_REMATCH[1]

In 2023, I don't think the extra pipe to grep, sed, awk or perl is relevant, especially when the question is:

Anybody knows of a way of doing this only with bash - without using sed, awk, etc?

edited Jan 19, 2023 at 10:00

Eric Aya


answered Jan 19, 2023 at 5:03

erwin


Jimbro · Accepted Answer · 2021-07-19 15:04:48Z

-2

foo="US/Central - 10:26 PM (CST)"

echo ${foo} | date +%H:%M

answered Jul 19, 2021 at 15:04

Jimbro


1 Comment

Mikołaj Głodziak Over a year ago

Hello Jimbro, welcome to StackOverflow! Unfortunately this is not a solution to the problem. Note that the OP wants to extract the time from the string, and your solution returns the current time.
