Regular expression to match string starting with a specific word

Question

How do I create a regular expression to match a word at the beginning of a string?

We are looking to match stop at the beginning of a string and anything can follow it.

For example, the expression should match:

stop
stop random
stopping

Peter Mortensen · Accepted Answer · 2021-11-13 21:38:30Z

356

If you wish to match only lines beginning with stop, use

^stop

If you wish to match lines beginning with the word stop followed by a space:

^stop\s

Or, if you wish to match lines beginning with the word stop followed by either a space or any other non-word character, you can use (your regex flavor permitting):

^stop\W

On the other hand, the following matches a word at the beginning of a string in most regex flavors (in these flavors \w matches the opposite of \W, that is, a single word character, so the + is needed to match the whole word):

^\w+

If your flavor does not have the \w shortcut, you can use

^[a-zA-Z0-9]+

Be aware that this second idiom matches only ASCII letters and digits, with no symbols whatsoever (not even the underscore, which \w usually also matches).

Check your regex flavor's manual to learn which shortcuts are allowed and what exactly they match (and how they deal with Unicode).
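As a rough illustration (not part of the original answer), the patterns above can be tried with Python's re module. The sample strings and the helper name matching are assumptions made up for this sketch; other flavors may treat \s, \W and \w slightly differently.

```python
import re

# Hypothetical sample inputs, chosen only to show where the patterns differ.
samples = ["stop", "stop random", "stopping", "stop!", "don't stop"]

patterns = {
    "starts with stop": r"^stop",
    "stop + whitespace": r"^stop\s",
    "stop + non-word char": r"^stop\W",
    "leading word": r"^\w+",
}


def matching(pattern: str) -> list[str]:
    """Return the samples in which the pattern is found."""
    return [s for s in samples if re.search(pattern, s)]


for label, pattern in patterns.items():
    print(f"{label:22} {pattern:10} {matching(pattern)}")
```

Note that ^stop\s and ^stop\W both reject the bare string stop, because they require one more character after the word.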

edited Nov 13, 2021 at 21:38

Peter Mortensen

answered Aug 6, 2009 at 18:06

Vinko Vrsalovic

Mike Dinescu · Accepted Answer · 2009-08-06 18:16:13Z

140

Try this:

/^stop.*$/

Explanation:

/ characters delimit the regular expression (i.e. they are not part of the regex per se)
^ means match at the beginning of the line
. followed by * means match any character (.), any number of times (*)
$ means to the end of the line

If you would like to enforce that stop be followed by a whitespace, you could modify the RegEx like so:

/^stop\s+.*$/

\s means any whitespace character
+ following the \s means there has to be at least one whitespace character following after the stop word

Note: Also keep in mind that the RegEx above requires that the stop word be followed by a space! So it wouldn't match a line that only contains: stop
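For comparison, here is a small sketch of the same two patterns in Python, which has no / delimiters and takes the pattern as a (raw) string instead. The variable names and sample strings are assumptions chosen for the example.

```python
import re

# "stop", then anything up to the end of the line.
whole_line = re.compile(r"^stop.*$")
# "stop", at least one whitespace character, then anything.
needs_space = re.compile(r"^stop\s+.*$")

for text in ["stop", "stop random", "stopping", "nonstop"]:
    print(
        f"{text!r:15} whole_line={bool(whole_line.search(text))} "
        f"needs_space={bool(needs_space.search(text))}"
    )
```

The second pattern rejects both stop and stopping, which is the behaviour the note above warns about.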

edited Aug 6, 2009 at 18:16

answered Aug 6, 2009 at 18:07

Mike Dinescu

5 Comments

JAB Over a year ago

Not all languages use forwardslashes to delimit regexes.

Mike Dinescu Over a year ago

@Cat Megex: Which is precisely why I added the explanation. If your language uses something else to delimit the regex, replace the / with the proper character

MarredCheese Over a year ago

@Mez yes, and such redundancy increases both clarity and performance rexegg.com/regex-optimizations.html#anchors

Raul Chiarella Over a year ago

This didn't work for me. Command: iostat -dx | awk ' $1 == "/^ada.*$/" { print $1 }'

Wiktor Stribiżew Over a year ago

Peter Mortensen · Accepted Answer · 2021-11-13 21:55:53Z

70

If you want to match the word stop and anything after it, and not only at the start of the line, you may use \bstop.*\b (the word followed by the rest of the line).

Or, if you want to match only the word itself within the string, use \bstop[a-zA-Z]* (only the words starting with stop).

Or, to match at the start of a line: ^stop[a-zA-Z]* matches the first word only, and ^stop.* matches the whole line (the first line of the string only).

And if you want to match an entire string starting with stop, including newlines, use /^stop.*/s (the s flag lets . match newlines, so the match can span multiple lines).
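A minimal Python sketch of these variants follows. It assumes that re.DOTALL plays the role of the s flag; the sample text and variable names are made up for illustration.

```python
import re

text = "stop the car\nstopping is hard"

# Words that start with "stop", anywhere in the text.
words = re.findall(r"\bstop[a-zA-Z]*", text)

# Without DOTALL, "." does not cross the newline: only the first line matches.
first_line = re.match(r"^stop.*", text).group()

# With DOTALL (the "s" flag), "." also matches newlines: the whole string.
everything = re.match(r"^stop.*", text, re.DOTALL).group()

print(words)
print(repr(first_line))
print(repr(everything))
```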

edited Nov 13, 2021 at 21:55

Peter Mortensen

answered Dec 10, 2015 at 10:38

Waxo

1 Comment

Neil Gaetano Lindberg Over a year ago

If you need to do this in JS with dynamic input, which forces you to use new RegExp, it is important to escape the backslash of the boundary (or of any other escape sequence) inside the string literal. For example, for the "stop" sample here: new RegExp('\\b' + varStringStop + '\\w')
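The same concern exists in Python, where "\b" in an ordinary string literal is a backspace character rather than a word boundary. A hedged sketch, using a hypothetical helper named starts_word and a \w* tail chosen for this example, could look like this:

```python
import re


def starts_word(prefix: str) -> re.Pattern:
    """Build a pattern for whole words that begin with `prefix` (hypothetical helper)."""
    # r"\b" is a raw string, so the backslash reaches the regex engine intact.
    # re.escape() neutralises regex metacharacters in the dynamic input.
    return re.compile(r"\b" + re.escape(prefix) + r"\w*")


print(starts_word("stop").findall("please stop, or stopping won't help nonstop"))
```

Escaping the dynamic part is a design choice of this sketch: without it, input such as a.b would be interpreted as a pattern rather than as literal text.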

Sedat Kilinc · Accepted Answer · 2023-03-02 09:46:40Z

29

Using the caret won't match every word beginning with "stop".

It matches only when the word is at the beginning of a line, as in "stop going". @Waxo gave the right answer.

This one is slightly better if you want to match any word beginning with "stop" and containing nothing but letters from A to Z:

\bstop[a-zA-Z]*\b

This would match all of the following:

stop (1)

stop random (2)

stopping (3)

want to stop (4)

please stop (5)

While

/^stop[a-zA-Z]*/

would only match (1) through (3), but not (4) and (5).
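The comparison can be checked with a short Python sketch over the five sample strings. The variable names anywhere and at_start are assumptions made for this example.

```python
import re

samples = ["stop", "stop random", "stopping", "want to stop", "please stop"]

anywhere = re.compile(r"\bstop[a-zA-Z]*\b")  # a stop-word anywhere in the text
at_start = re.compile(r"^stop[a-zA-Z]*")     # a stop-word only at the start

for number, text in enumerate(samples, start=1):
    a = anywhere.search(text)
    b = at_start.search(text)
    print(number, repr(text), a.group() if a else None, b.group() if b else None)
```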

edited Mar 2, 2023 at 9:46

answered Dec 10, 2017 at 20:45

Sedat Kilinc

Alex B · Accepted Answer · 2009-08-06 18:10:10Z

10

If you want to match anything that starts with "stop" including "stop going", "stop" and "stopping" use:

^stop

If you want to match the word stop followed by anything as in "stop going", "stop this", but not "stopped" and not "stopping" use:

^stop\W
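One detail worth checking: ^stop\W needs an actual character after the word, so it does not match a string that consists of stop alone. The sketch below, in Python, contrasts it with ^stop\b. The \b variant is not part of this answer and is shown only as an assumption about what may be wanted when the bare word should match too.

```python
import re

samples = ["stop", "stop going", "stop this", "stopped", "stopping"]

# Requires a non-word character right after "stop".
followed_by_non_word = [s for s in samples if re.search(r"^stop\W", s)]

# Requires only a word boundary after "stop", so end-of-string is fine.
whole_word = [s for s in samples if re.search(r"^stop\b", s)]

print(followed_by_non_word)
print(whole_word)
```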

answered Aug 6, 2009 at 18:10

Alex B

Mez · Accepted Answer · 2009-08-06 18:07:16Z

9

/stop([a-zA-Z])+/

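This matches stop followed by one or more letters (stopped, stopping, etc.) anywhere in the string. Because + requires at least one letter, it does not match the bare word stop; use * instead of + to include it.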

However, if you just want to match "stop" at the start of a string

/^stop/

will do :D

answered Aug 6, 2009 at 18:07

Mez

2 Comments

Alex B Over a year ago

This will match "don't stop going"

lostintranslation Over a year ago

This will not match stop123 or stop,.

Manisha Chaurasia · Accepted Answer · 2017-12-04 01:37:35Z

5

If you want the word to start with "stop", you can use the following pattern. "^stop.*"

This will match words starting with stop followed by anything.

answered Dec 4, 2017 at 1:37

Manisha Chaurasia

3 Comments

Stephen Rauch Over a year ago

Could you not just use "^stop"?

Manisha Chaurasia Over a year ago

It depends. In Java, we can either use Pattern and Matcher objects or call the .matches() method directly on a String. They differ in result, as shown below:

String line = "stopped";
String pattern = "^stop";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
System.out.println(m.find());               // prints true
System.out.println(line.matches(pattern));  // prints false

Sedat Kilinc Over a year ago

This matches only if the word is at the beginning of the line. If words beginning with "stop" are in the middle of the line or at the end, this regex won't match. @StephenRauch if you omit [a-z]* you wouldn't get words like "stopping" in whole. In the case of "stopping" you get "stop", and "ping" would be missing.
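The find() versus matches() distinction raised in the comments has a rough Python analog: re.search looks for the pattern anywhere, while re.fullmatch requires the pattern to cover the entire string. The sketch below is an illustration under that assumption, not a statement about Java's internals.

```python
import re

line = "stopped"
pattern = r"^stop"

# Like Matcher.find(): succeeds if the pattern occurs in the string.
found = re.search(pattern, line) is not None

# Like String.matches(): the pattern must consume the whole string.
whole = re.fullmatch(pattern, line) is not None

# Adding ".*" lets the full-match variant succeed as well.
whole_with_tail = re.fullmatch(pattern + ".*", line) is not None

print(found, whole, whole_with_tail)  # True False True
```

This is why the trailing .* matters when the API in use insists on matching the whole input.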

Peter Mortensen · Accepted Answer · 2021-11-13 22:17:35Z

3

/^stop*$/i

i - makes the match case-insensitive.

edited Nov 13, 2021 at 22:17

Peter Mortensen

answered Sep 28, 2021 at 16:11

Charles Kasasira

1 Comment

General Grievance Over a year ago

p* means any number of ps and only ps, including 0.
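The point in the comment can be verified with a short Python sketch. The names buggy and fixed, and the corrected pattern ^stop.*$, are assumptions introduced here for comparison; re.IGNORECASE stands in for the i flag.

```python
import re

buggy = re.compile(r"^stop*$", re.IGNORECASE)   # "*" applies to "p" only
fixed = re.compile(r"^stop.*$", re.IGNORECASE)  # "." supplies the "anything"

samples = ["sto", "STOPPP", "stop random", "Stopping"]

# Map each sample to (matches buggy pattern, matches fixed pattern).
results = {t: (bool(buggy.search(t)), bool(fixed.search(t))) for t in samples}

for text, (b, f) in results.items():
    print(f"{text!r:15} buggy={b} fixed={f}")
```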

Robert Elwell · Accepted Answer · 2009-08-06 18:22:08Z

I'd advise against a simple regular expression approach to this problem. There are too many words that are substrings of other unrelated words, and you'll probably drive yourself crazy trying to overadapt the simpler solutions already provided.

You'll want at least a naive stemming algorithm (try the Porter stemmer; free code is available in most languages) to process the text first. Keep this processed text and the unprocessed text in two separate space-split arrays. Make sure each non-alphabetical character also gets its own index in these arrays. Whatever list of words you're filtering, stem them also.

The next step would be to find the array indices which match your list of stemmed 'stop' words. Remove those from the unprocessed array, and then rejoin on spaces.

This is only slightly more complicated, but it will be a much more reliable approach. If you've got any doubts on the value of a more NLP-oriented approach, you might want to do some research into clbuttic mistakes.
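The steps above can be sketched roughly as follows. This is only an illustration: naive_stem is a hypothetical, very crude suffix stripper standing in for a real Porter stemmer, and remove_words and the sample sentence are likewise made up for the example.

```python
import re


def naive_stem(word: str) -> str:
    """Crude stand-in for a real stemmer (hypothetical, NOT the Porter algorithm)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Undo consonant doubling, e.g. "stopp" -> "stop".
    if len(word) > 3 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


def remove_words(text: str, unwanted: list[str]) -> str:
    """Drop tokens whose stem matches the stem of any unwanted word."""
    # Each run of letters is one token; each other non-space character
    # gets its own index, as the answer suggests.
    tokens = re.findall(r"[A-Za-z]+|[^A-Za-z\s]", text)
    stems = [naive_stem(t) for t in tokens]
    banned = {naive_stem(w) for w in unwanted}
    kept = [tok for tok, stem in zip(tokens, stems) if stem not in banned]
    return " ".join(kept)


print(remove_words("Please stop, he stopped being unstoppable.", ["stop"]))
```

With this toy stemmer, stop and stopped are removed while unstoppable survives, which a plain substring regex for stop would not manage.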

3gwebtrain · Accepted Answer · 2022-10-11 05:56:17Z

1

Can you try this:

https://regex101.com/r/P3qfKG/1

reg = /stop(\w+| [^ ]+|$)/gm

It will select stop on its own, words that start with stop, and stop together with the next word.
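A quick way to see what this pattern selects is the Python sketch below. It assumes re.MULTILINE corresponds to the m flag and finditer to the g flag; the sample text is invented for the example.

```python
import re

pattern = re.compile(r"stop(\w+| [^ ]+|$)", re.MULTILINE)

text = "stop\nstop random words\nstopping\nnonstop"

# Collect every match, as the "g" flag would in JavaScript.
matches = [m.group(0) for m in pattern.finditer(text)]

print(matches)
```

Because the pattern has no ^ or \b in front, it also picks up the stop inside nonstop, so it is not limited to strings that start with the word.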

answered Oct 11, 2022 at 5:56

3gwebtrain

1 Comment

3gwebtrain Over a year ago

regex101.com/r/P3qfKG/1
