9

I have an application that needs to validate several fields. One of them is a last name, which can consist of two words. My regex therefore has to accept spaces, but nothing I have tried so far works.

Here is my regex:

@"^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ-\s]+$"

The \s is meant to match the spaces, but it does not work and I get this error message:

parsing "^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ-\s]+$" - Cannot include class \s in character range.

Any ideas?

    A side note, but have a look at Unicode properties. \p{L} matches a letter in any language, so your expression could be written as @"^[\p{L}\s][\p{L}\s-]+$", which is a lot nicer and saves you from listing each special letter. Commented Apr 18, 2013 at 8:35
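
As a rough illustration of the \p{L} suggestion in the comment above, here is a minimal sketch. The class name `UnicodeNameDemo` and the helper `IsValidLastName` are hypothetical names introduced only for this example, and the sample names are assumed to use precomposed accented letters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeNameDemo
{
    // Pattern from the comment: \p{L} matches any Unicode letter,
    // and the trailing - inside the class is a literal hyphen.
    public const string Pattern = @"^[\p{L}\s][\p{L}\s-]+$";

    // Hypothetical helper, not part of the original question.
    public static bool IsValidLastName(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    public static void Main()
    {
        // Two names with non-ASCII letters and one containing a digit.
        foreach (var name in new[] { "De La Peña", "Müller-Lüdenscheidt", "Smith2" })
        {
            Console.WriteLine(name + " => " + IsValidLastName(name));
        }
    }
}
```

Note that \s also accepts tabs and other whitespace, so a literal space may be a better fit if only spaces should be allowed.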

3 Answers

16

- denotes a character range, just as you use A-Z to describe any character between A and Z. Your regex contains ñ-\s, which the engine tries to interpret as any character between ñ and \s. It then notices that \s cannot be the end of a range, because \s is not a single character but shorthand for any whitespace character.

That's where the error comes from.

To get rid of this, put - at the end of your character class whenever you want to include the literal - character:

@"^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ\s-]+$"

This way, the engine knows that \s- is not a character range, but the two items \s and - separately.

The other way is to escape the - character:

@"^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ\-\s]+$"

So now the engine interprets ñ\-\s not as a character range, but as any of the characters ñ, - or \s. Personally, though, I try to avoid escaping wherever possible, because IMHO it clutters the expression and needlessly lengthens it.
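
To check both fixes side by side, here is a minimal sketch. The class name `LastNameRegexDemo` and the helper `Compiles` are hypothetical names introduced only for this example; the three patterns are the original one from the question and the two variants described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastNameRegexDemo
{
    // Original pattern from the question: ñ-\s is read as a range and is rejected.
    public const string Broken = @"^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ-\s]+$";
    // Fix 1: move the literal - to the end of the character class.
    public const string TrailingHyphen = @"^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ\s-]+$";
    // Fix 2: escape the - so it cannot start a range.
    public const string EscapedHyphen = @"^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ\-\s]+$";

    // Hypothetical helper: returns false if the Regex constructor rejects the pattern.
    public static bool Compiles(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Pattern parse errors surface as ArgumentException (or a subclass of it).
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Broken compiles: " + Compiles(Broken));
        foreach (var pattern in new[] { TrailingHyphen, EscapedHyphen })
        {
            Console.WriteLine(Regex.IsMatch("Dupont-Aignan", pattern));
            Console.WriteLine(Regex.IsMatch("De La Peña", pattern));
            Console.WriteLine(Regex.IsMatch("O'Neil", pattern));
        }
    }
}
```

Both fixed patterns should behave identically; the choice between them is purely a matter of readability versus robustness against later edits.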


2 Comments

Escaping is less brittle. Say you have a character class for operations: [+-]. Another programmer may change it to [+-*/], breaking the pattern.
I agree, but you can argue that either way. Say you have a pattern [+\-*] because you can't do divisions. Some day you can, and another programmer changes it to [+/-*] because he thinks you just got the slash the wrong way around. Off goes your escaping. So this is not really an argument for either approach. I just value readability a little more, especially in regexes, because they're complicated enough as it is.
4

You need to escape the last - character; ñ-\s is parsed as a range, just like a-z:

@"^[a-zA-Zàéèêçñ\s][a-zA-Zàéèêçñ\-\s]+$"

See also on Regex Storm: [a-\s], [a\-\s]


0

[RegularExpression(@"^[a-zA-Z\s]+$", ErrorMessage = "Only alphabetic characters and spaces are allowed.")]

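This works if only unaccented letters and spaces are needed; it does not accept accented characters or hyphens.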
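
For context, here is a minimal sketch of how such an attribute could be exercised outside a web framework. The `Person` class, its `LastName` property, and the `AttributeDemo.IsValid` helper are hypothetical names introduced only for this example; `Validator.TryValidateObject` comes from System.ComponentModel.DataAnnotations.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical model class, assumed only for this illustration.
public class Person
{
    [RegularExpression(@"^[a-zA-Z\s]+$", ErrorMessage = "Only alphabetic characters and spaces are allowed.")]
    public string LastName { get; set; }
}

public static class AttributeDemo
{
    // Hypothetical helper: runs the data-annotation validators against one value.
    public static bool IsValid(string lastName)
    {
        var person = new Person { LastName = lastName };
        var context = new ValidationContext(person);
        var results = new List<ValidationResult>();
        // validateAllProperties: true makes the validator check RegularExpression too.
        return Validator.TryValidateObject(person, context, results, true);
    }

    public static void Main()
    {
        foreach (var name in new[] { "Van Damme", "Dupont-Aignan", "Peña" })
        {
            Console.WriteLine(name + " => " + IsValid(name));
        }
    }
}
```

With this pattern, names containing a hyphen or an accented letter are expected to fail validation, so it only fits when plain ASCII names are acceptable.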

