0

Whenever I enter the following...

Pattern pmessage = Pattern.compile("\s*\p{Alnum}[\p{Alnum}\s]*");
Matcher mmessage = pmessage.matcher(message);
Matcher msubject = pmessage.matcher(subject);

I get an Invalid Escape Sequence error. Anyone have any idea why / how I fix this?

2
  • Be warned that even corrected for ddoouubbllee bbaacckkssllllaasshheess, that doesn’t work with Java native characters, only with ASCII. Commented Dec 3, 2010 at 12:45
  • Shouldn't that be bbaacckkssllaasshheess instead of bbaacckkssllllaasshheess? :) Commented Dec 3, 2010 at 13:00

4 Answers

2

For a version of \p{Alpha} that works on the Java native character set instead of being stuck unable to process anything other than legacy data from the 1960s, you need to use

alphabetics = "[\\pL\\pM\\p{Nl}]";

For a version of numerics in the same sense, you have to choose which of these you want:

ASCII_digits    = "[0-9]";
all_numbers     = "\\pN";
decimal_numbers = "\\p{Nd}";

because which one applies varies depending on circumstances. We’ll assume you copied one of those three to a numerics variable.

Assuming you then want alphanumerics based on the definition above, you could then write:

 alphanumerics = "[" + alphabetics + numerics + "]";
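As a rough sketch of how these pieces fit together, the following compares the Unicode-aware class with the ASCII-only \p{Alnum}. The class name UnicodeAlnumDemo and the constant names are made up for illustration, and it assumes you picked decimal_numbers as your numerics:

```java
import java.util.regex.Pattern;

class UnicodeAlnumDemo {
    // Letters, combining marks, and letter-like numbers (e.g. Roman numerals).
    static final String ALPHABETICS = "[\\pL\\pM\\p{Nl}]";
    // Assumption: decimal digits in any script were chosen as the numerics.
    static final String NUMERICS = "\\p{Nd}";
    // Nesting one bracketed class inside another forms their union in Java.
    static final String ALPHANUMERICS = "[" + ALPHABETICS + NUMERICS + "]";

    static final Pattern UNICODE_WORD = Pattern.compile(ALPHANUMERICS + "+");
    // Without extra flags, \p{Alnum} only covers ASCII letters and digits.
    static final Pattern ASCII_WORD = Pattern.compile("\\p{Alnum}+");

    public static void main(String[] args) {
        String word = "caf\u00e9"; // "café"
        System.out.println(UNICODE_WORD.matcher(word).matches()); // true
        System.out.println(ASCII_WORD.matcher(word).matches());   // false
    }
}
```

The ASCII-only pattern rejects the word because of the single accented letter, which is the limitation the Unicode property classes are meant to avoid.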

However, if what you mean by alphanumerics is the \w sense of program identifiers, you have to add some stuff.

 identifier_chars = "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";
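A minimal sketch of using that class to test identifier-like strings might look like this. The class IdentifierCharsDemo and the helper isIdentifierLike are hypothetical names, not part of any library:

```java
import java.util.regex.Pattern;

class IdentifierCharsDemo {
    // Letters, marks, decimal digits, letter numbers, connector punctuation
    // (such as the underscore), plus enclosed alphanumerics that are symbols.
    static final String IDENTIFIER_CHARS =
        "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";

    static final Pattern IDENTIFIER = Pattern.compile(IDENTIFIER_CHARS + "+");

    // Hypothetical helper: true when every character belongs to the class above.
    static boolean isIdentifierLike(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIdentifierLike("user_name2"));  // true
        System.out.println(isIdentifierLike("na\u00efve"));  // true ("naïve")
        System.out.println(isIdentifierLike("a-b"));         // false, hyphen is not included
    }
}
```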

This issue is discussed at length in this answer, where you’ll also find a link to some alpha code of mine that does these transforms for you automatically. I hope to get a chance to rewrite it to take up less space this weekend.


1

Double each backslash: Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*")

Backslashes inside string literals have a special meaning, and have to be duplicated in order for the actual backslash character to become part of the string (which is what is required in your regex example.)
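To make the effect of the doubling visible, here is a small sketch that prints the string the regex engine actually receives and then tries the pattern on a few inputs. The class name EscapeDemo and the method isValid are invented for this example:

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Each "\\" in the source becomes a single backslash in the string value.
    static final String REGEX = "\\s*\\p{Alnum}[\\p{Alnum}\\s]*";
    static final Pattern PMESSAGE = Pattern.compile(REGEX);

    // matches() succeeds only if the whole input fits the pattern.
    static boolean isValid(String text) {
        return PMESSAGE.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(REGEX);                  // \s*\p{Alnum}[\p{Alnum}\s]*
        System.out.println(isValid("  Hello 42"));  // true
        System.out.println(isValid("   "));         // false, no alphanumeric character
    }
}
```

The first line of output shows single backslashes, which is the form the regex engine needs; the doubled form exists only in the Java source.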


1

Keep in mind that backslashes are special characters in Java strings and need to be escaped with an additional backslash:

Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");


1

You didn't correctly escape your "\" characters: in Java, "\\s" will give you \s, so you should write:

Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");

