2

I have the following Java code that is supposed to extract a URL from a String:

public static void main(String[] args) {
    String text = "Link to https://some.domain.com/subfolder?sometext is     available";
    String regex = "https://some\\.domain\\.com/subfolder[^ ]*";
    Pattern urlPattern = Pattern.compile(regex);

    Matcher m = urlPattern.matcher(text);

    String url = m.group();

    System.out.println(url);

    return;
}

However, no match is returned and the code fails with an IllegalStateException.

What is wrong with the RegEx?

8
  • String regex = "https:\/\/some\\.domain\\.com/subfolder[^ ]*"; Commented Aug 29, 2016 at 11:24
  • @lordkain Why do you want to escape the slashes? Commented Aug 29, 2016 at 11:25
  • @lordkain that is an illegal escape sequence. Besides, I also tried simply https.* which also fails. Commented Aug 29, 2016 at 11:27
  • I don't think you can use 'group' without calling find or matches methods first. Commented Aug 29, 2016 at 11:28
  • if (m.matches()) { String url = m.group(); System.out.println(url); } The Javadoc for group() says: "IllegalStateException - If no match has yet been attempted, or if the previous match operation failed". Commented Aug 29, 2016 at 11:31

3 Answers

7

You can't ask a Matcher to give a .group() unless you have called a method which asks the Matcher to operate on the input: one of .find() (preferred), .lookingAt() or .matches().

This is why you get an IllegalStateException.
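To make the state requirement concrete, here is a minimal sketch that reuses the regex and input from the question. The class name GroupStateDemo and the helper groupThrows are made up for illustration; only Pattern, Matcher, find() and group() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the question's code.
class GroupStateDemo {
    // Returns true if calling group() on the matcher throws IllegalStateException.
    static boolean groupThrows(Matcher m) {
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Regex and text copied from the question.
        Pattern p = Pattern.compile("https://some\\.domain\\.com/subfolder[^ ]*");
        Matcher m = p.matcher("Link to https://some.domain.com/subfolder?sometext is     available");

        System.out.println(groupThrows(m)); // true: no match attempted yet
        m.find();
        System.out.println(groupThrows(m)); // false: find() succeeded

        Matcher failed = p.matcher("no link in this text");
        failed.find();
        System.out.println(groupThrows(failed)); // true: find() was called but failed
    }
}
```

The first and third cases correspond to the two conditions the Javadoc lists for the exception: no match attempted, or the previous match operation failed.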

As for the differences between the three, the Javadoc covers them in full, but here is a quick reminder:

  • .find() does "real" regex matching: it will try and match the regex anywhere in the input text;
  • .lookingAt() adds the constraint that the pattern should match at the beginning of the input text;
  • .matches() is a misnomer since, in addition to the constraint imposed by .lookingAt(), it also requires that the full input text (the "entire region" in the javadoc) is matched.

Please also recall that those three methods return a boolean indicating whether the match was successful; if the result is false, you can't call .group().
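As a rough illustration of how the three methods differ, the sketch below runs each of them against the question's pattern for three inputs. The class name MatcherMethodsDemo and the helper describe are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the question's code.
class MatcherMethodsDemo {
    // Pattern taken from the question.
    static final Pattern URL_PATTERN =
            Pattern.compile("https://some\\.domain\\.com/subfolder[^ ]*");

    // Reports which of the three match operations succeed for the input.
    static String describe(String input) {
        // A fresh Matcher per call keeps the three checks independent.
        boolean found = URL_PATTERN.matcher(input).find();
        boolean prefix = URL_PATTERN.matcher(input).lookingAt();
        boolean whole = URL_PATTERN.matcher(input).matches();
        return "find=" + found + " lookingAt=" + prefix + " matches=" + whole;
    }

    public static void main(String[] args) {
        String url = "https://some.domain.com/subfolder?sometext";
        // URL in the middle of the text: only find() succeeds.
        System.out.println(describe("Link to " + url + " is     available"));
        // prints: find=true lookingAt=false matches=false

        // URL at the start of the text: lookingAt() succeeds too.
        System.out.println(describe(url + " is     available"));
        // prints: find=true lookingAt=true matches=false

        // URL is the whole text: all three succeed.
        System.out.println(describe(url));
        // prints: find=true lookingAt=true matches=true
    }
}
```

For the input in the question, where the URL sits in the middle of the sentence, only find() would report a match.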


3

You forgot to call m.find() or m.matches(). This is mandatory; otherwise group() does not work.

find() returns true if the pattern is matched. Only in that case will group() return what you expect.

So modify your code as follows:

...
if (!m.find()) {
    return;
}
String url = m.group();
...
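Applied to the question's code, the fix might look like the sketch below. The class name UrlExtractor, the method extractUrl, and the use of Optional for the no-match case are assumptions made for this example, not something the question prescribes.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class around the question's regex.
class UrlExtractor {
    private static final Pattern URL_PATTERN =
            Pattern.compile("https://some\\.domain\\.com/subfolder[^ ]*");

    // Returns the first matching URL, or an empty Optional if there is none.
    static Optional<String> extractUrl(String text) {
        Matcher m = URL_PATTERN.matcher(text);
        // group() is only called after find() has returned true.
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String text = "Link to https://some.domain.com/subfolder?sometext is     available";
        System.out.println(extractUrl(text).orElse("no URL found"));
        // prints: https://some.domain.com/subfolder?sometext
    }
}
```

Returning Optional is just one way to make the "no match" case explicit; an early return, as in the snippet above, works equally well.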

EDIT: As for which method to call, find() or matches(): find() looks for the pattern anywhere in the string, while matches() must match the entire string. They relate to each other like contains() and equals() on strings.

I personally prefer find(), because then the regex alone fully defines the behavior. If I want to match the whole string, I use ^ and $.
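The contains()/equals() analogy and the anchored variant can be checked with a small sketch. The class FindVersusMatchesDemo, its helpers, and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for comparing find() and matches().
class FindVersusMatchesDemo {
    // True if the regex matches somewhere in the input (like contains()).
    static boolean findAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // True only if the regex matches the entire input (like equals()).
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(findAnywhere("folder", "subfolder"));        // true
        System.out.println(matchesWhole("folder", "subfolder"));        // false
        // With ^ and $ the regex itself demands a full match, so find() agrees with matches().
        System.out.println(findAnywhere("^folder$", "subfolder"));      // false
        System.out.println(findAnywhere("^subfolder$", "subfolder"));   // true
    }
}
```

One caveat: by default `$` also matches just before a trailing line terminator, so for input that ends in a newline the anchored find() and matches() are not strictly interchangeable.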

1 Comment

You forgot .lookingAt()

2

Since m.group()

Returns the input subsequence matched by the previous match.

you have to call m.matches() or m.find() before using m.group().
