I'm new to regex so I hope that there is an easy problem to spot in this. Basically I'm trying to check if user input matches a word or phrase containing letters or numbers.
([a-zA-Z0-9]+)|(([a-zA-Z0-9]+\s+[a-zA-Z0-9]+)+)\s*
Hope it's not too messy, all help is very much appreciated.
It works for a phrase like "the man is a" but not for "the man is a dog" which puzzles me.
2 Answers
It should work for both of your phrases and capture "the" both times, because your regex never evaluates the second alternative: the first one, ([a-zA-Z0-9]+), matches every time (see here).
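The question doesn't say which engine or matching function is in use, so here is a minimal sketch that assumes Python's re module. With a plain search, the first alternative wins and only "the" is matched for both phrases. If the whole string has to match, the results differ, which is one possible explanation for what the asker saw: the repeated group in the second alternative needs its word pairs to sit back to back with no whitespace between them, so the engine has to split words across pairs, and the one-letter word "a" cannot be split. The helper name describe is made up for illustration.

```python
import re

# The pattern exactly as posted in the question.
PATTERN = re.compile(r"([a-zA-Z0-9]+)|(([a-zA-Z0-9]+\s+[a-zA-Z0-9]+)+)\s*")

def describe(text):
    """Return (text found by a plain search, whether the whole string matches)."""
    found = PATTERN.search(text)
    whole = PATTERN.fullmatch(text)
    return (found.group(0) if found else None, whole is not None)

for phrase in ("the man is a", "the man is a dog"):
    print(repr(phrase), describe(phrase))
# 'the man is a' ('the', True)
# 'the man is a dog' ('the', False)
```

So under search semantics both phrases "work" but only capture the first word, while under whole-string semantics the second phrase fails.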
If all you want to do is match a word or phrase containing letters or numbers I'd use
(?:[a-zA-Z0-9]+\s?)+
I wouldn't use \w instead of [a-zA-Z0-9], because in many regex engines \w also matches _.
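To illustrate the underscore point, here is a small sketch assuming Python's re module (other engines may differ in details). The sample string is hypothetical.

```python
import re

ALNUM = re.compile(r"[a-zA-Z0-9]+")  # letters and digits only
WORD = re.compile(r"\w+")            # also accepts the underscore

sample = "user_name"  # hypothetical input containing an underscore

alnum_ok = ALNUM.fullmatch(sample) is not None
word_ok = WORD.fullmatch(sample) is not None
print(alnum_ok, word_ok)  # False True
```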
Explanation:
(?: #start of non-capturing group
[a-zA-Z0-9] #character class matching one of a-z, A-Z and 0-9
+ #between one and unlimited times
\s? #optional whitespace (includes tabs and newlines)
) #end of non-capturing group
+ #repeat the group between one and unlimited times
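The breakdown above can be written almost verbatim as a commented pattern. This is only a sketch, assuming Python's re module with the re.VERBOSE flag and a whole-string check via fullmatch; the function name is_phrase is made up for illustration.

```python
import re

PHRASE = re.compile(
    r"""
    (?:             # start of non-capturing group
      [a-zA-Z0-9]+  # one or more letters or digits
      \s?           # optional single whitespace character
    )+              # repeat the whole group one or more times
    """,
    re.VERBOSE,
)

def is_phrase(text):
    """True if text consists only of alphanumeric words separated by single whitespace."""
    return PHRASE.fullmatch(text) is not None

print(is_phrase("the man is a"))      # True
print(is_phrase("the man is a dog"))  # True
print(is_phrase("the man!"))          # False
```

Note that \s? allows at most one whitespace character between words, so input with double spaces would be rejected by this sketch.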
3 Comments
Alvin Bunk
What is the ":" after the ? used for in this regex, Basti? I always have a tough time with regexes as well.
KeyNone
@AlvinBunk
() is a regular capturing group, while (?:) is a non-capturing group: it only groups tokens without capturing them.
Alvin Bunk
Great, I think this answers the OP's question.
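For anyone else wondering about (?:...), the difference between the two group types can be sketched like this, assuming Python's re module. The sample text and patterns are made up for illustration.

```python
import re

text = "ab12 cd34"  # hypothetical sample input

# Both parenthesised parts are captured.
capturing = re.match(r"([a-z]+)([0-9]+)", text)
# The letters are only grouped, not captured.
non_capturing = re.match(r"(?:[a-z]+)([0-9]+)", text)

print(capturing.groups())      # ('ab', '12')
print(non_capturing.groups())  # ('12',)

# The pattern from the answer defines no capturing groups at all.
answer_pattern = re.compile(r"(?:[a-zA-Z0-9]+\s?)+")
print(answer_pattern.groups)   # 0
```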
([\w\s]+)