60

I try to get string between <%= and %>, here is my implementation:

String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
Pattern pattern = Pattern.compile("<%=(.*?)%>");
String[] result = pattern.split(str);
System.out.println(Arrays.toString(result));

it return

[ZZZZL ,  AFFF ]

But my expectation is:

[ dsn , AFG ]

Where am i wrong and how to correct it ?

1
  • 5
    It's appears you're confusing splitting a string with pattern matching. Commented May 16, 2013 at 20:57

4 Answers 4

96

Your pattern is fine. But you shouldn't be split()ting it away, you should find() it. Following code gives the output you are looking for:

String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
Pattern pattern = Pattern.compile("<%=(.*?)%>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Sign up to request clarification or add additional context in comments.

1 Comment

I added Pattern.DOTALL for those quite common cases when the expected match spans across multiple lines.
60

I have answered this question here: https://stackoverflow.com/a/38238785/1773972

Basically use

StringUtils.substringBetween(str, "<%=", "%>");

This requirs using "Apache commons lang" library: https://mvnrepository.com/artifact/org.apache.commons/commons-lang3/3.4

This library has a lot of useful methods for working with string, you will really benefit from exploring this library in other areas of your java code !!!

3 Comments

I seriously dont understand why do we need to add a new dependency jar/library all together in the project just to acheive a small trivial funcionality
Adding dependencies has become a joke :)
@Stunner, you are right, it's overkill. But this answer is mostly for those who already have commons-lang3 in their classpath.
8

Jlordo approach covers specific situation. If you try to build an abstract method out of it, you can face a difficulty to check if 'textFrom' is before 'textTo'. Otherwise method can return a match for some other occurance of 'textFrom' in text.

Here is a ready-to-go abstract method that covers this disadvantage:

  /**
   * Get text between two strings. Passed limiting strings are not 
   * included into result.
   *
   * @param text     Text to search in.
   * @param textFrom Text to start cutting from (exclusive).
   * @param textTo   Text to stop cuutting at (exclusive).
   */
  public static String getBetweenStrings(
    String text,
    String textFrom,
    String textTo) {

    String result = "";

    // Cut the beginning of the text to not occasionally meet a      
    // 'textTo' value in it:
    result =
      text.substring(
        text.indexOf(textFrom) + textFrom.length(),
        text.length());

    // Cut the excessive ending of the text:
    result =
      result.substring(
        0,
        result.indexOf(textTo));

    return result;
  }

2 Comments

This will not return multiple occurrences when there are multiple matches. Only the first one.
Though it returns only first match, fulfills my need.
5

Your regex looks correct, but you're splitting with it instead of matching with it. You want something like this:

// Untested code
Matcher matcher = Pattern.compile("<%=(.*?)%>").matcher(str);
while (matcher.find()) {
    System.out.println(matcher.group());
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.