I have the following entry in a properties file:

some.key = \n
  [1:Some value] \n
  [14:Some other value] \n
  [834:Yet another value] \n

I am trying to parse it using a regular expression, but I can't seem to get the grouping correct. I am trying to print out a key/value for each entry. Example: Key="834", Value="Yet another value"

private static final String REGEX_PATTERN = "[(\\d+)\\:(\\w+(\\s)*)]+";

private void foo(String propValue){
    final Pattern p = Pattern.compile(REGEX_PATTERN);
    final Matcher m = p.matcher(propValue);
    while (m.find()) {
        final String key = m.group(0).trim();
        final String value = m.group(1).trim();
        System.out.println(String.format("Key[%s] Value[%s]", key, value));            
    }
}

The error I get is:

Exception: java.lang.IndexOutOfBoundsException: No group 1

I thought I was grouping correctly in the regex but I guess not. Any help would be appreciated!

Thanks

UPDATE: Escaping the brackets worked. Thanks for the feedback! I changed the pattern to the following:

 private static final String REGEX_PATTERN = "\\[(\\d+)\\:(\\w+(\\w|\\s)*)\\]+";
  • Index out of bounds means you are asking for an element at an index that does not exist. You are probably referring to an index that was never set. Commented May 17, 2012 at 13:18
  • Alfabravo - Yes, I am aware, but I am curious why the regular expression is incorrect. Commented May 17, 2012 at 13:20

4 Answers

[ should be escaped (as well as ]).

"\\[(\\d+)....\\]+"

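Square brackets delimit a character class, which matches any single character listed inside it: [0-9] is equivalent to (0|1|2|...|9). Inside a character class, parentheses are literal characters and do not create groups.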
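To see why the original pattern reports "No group 1", it can help to ask the engine how many capturing groups it found. The following is a minimal sketch; the class name `BracketGroups` and the helper `groupCount` are made up for illustration, and only `Pattern` and `Matcher.groupCount()` come from the JDK.

```java
import java.util.regex.Pattern;

public class BracketGroups {
    // Returns the number of capturing groups the regex engine sees in the pattern.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        // Unescaped: everything between [ and ] is one character class followed by '+',
        // so the parentheses are literal characters, not groups.
        String unescaped = "[(\\d+)\\:(\\w+(\\s)*)]+";
        // Escaped: the brackets are literals and the parentheses form capturing groups.
        String escaped = "\\[(\\d+)\\:(\\w+(\\w|\\s)*)\\]";

        System.out.println(groupCount(unescaped)); // prints 0
        System.out.println(groupCount(escaped));   // prints 3
    }
}
```

With zero groups, only `m.group(0)` (the whole match) is available, which is why `m.group(1)` throws.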


Try this:

private static final String REGEX_PATTERN = "\\[(\\d+):([\\w\\s]+)\\]";

final Pattern p = Pattern.compile(REGEX_PATTERN);
final Matcher m = p.matcher(propValue);
while (m.find()) {
    final String key = m.group(1).trim();
    final String value = m.group(2).trim();
    System.out.println(String.format("Key[%s] Value[%s]", key, value));
}
  1. The [ and ] need to be escaped because, unescaped, they mark the start and end of a character class.
  2. group(0) is always the full match, so your capturing groups start at 1.
  3. Note how I wrote the second group as [\\w\\s]+. This is a character class that matches one or more word or whitespace characters.
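Putting the three points together, here is a self-contained sketch that collects the entries into a map. The class name `EntryParser`, the `parse` method, and the sample string in `main` are assumptions for illustration; the sample assumes the property value arrives with real newline characters between the entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EntryParser {
    // Group 1 is the numeric key, group 2 the value text between ':' and ']'.
    private static final Pattern ENTRY = Pattern.compile("\\[(\\d+):([\\w\\s]+)\\]");

    // Returns the entries in the order they appear in the property value.
    static Map<String, String> parse(String propValue) {
        Map<String, String> entries = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(propValue);
        while (m.find()) {
            entries.put(m.group(1).trim(), m.group(2).trim());
        }
        return entries;
    }

    public static void main(String[] args) {
        // Assumed shape of the loaded value: entries separated by newlines.
        String propValue = "\n  [1:Some value] \n  [14:Some other value] \n  [834:Yet another value] \n";
        parse(propValue).forEach((k, v) ->
                System.out.println(String.format("Key[%s] Value[%s]", k, v)));
    }
}
```

Note that [\\w\\s]+ only accepts letters, digits, underscores, and whitespace, so a value containing punctuation would not match; a negated class such as [^\\]]+ is one possible alternative if that matters.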


It's your regex, [] are special characters and need to be escaped if you want to interpret them literally.

Try

"\\[(\\d+)\\:(\\w+(\\w|\\s)*)\\]"

Note - I removed the trailing '+'. Calling find() in a loop already returns each successive match, so the + is not necessary. (Java has no separate "global" switch; the find() loop plays that role.)

I can't help but feel this might be simpler without regex though, perhaps by splitting on \n or [ and then splitting on : for each of those.
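A rough sketch of that splitting approach follows. The class name `SplitParser` and the `parse` method are hypothetical, and the code assumes one bracketed entry per line with the key before the first colon. Strictly speaking, `String.split` still takes a regular expression, but here it is only given single literal characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitParser {
    // Parses lines of the form "[key:value]"; anything else is skipped.
    static Map<String, String> parse(String propValue) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String line : propValue.split("\n")) {
            String entry = line.trim();
            if (!entry.startsWith("[") || !entry.endsWith("]")) {
                continue; // blank line or not a bracketed entry
            }
            // Drop the surrounding brackets, then split on the first ':' only,
            // so a value may itself contain colons.
            String body = entry.substring(1, entry.length() - 1);
            String[] parts = body.split(":", 2);
            if (parts.length == 2) {
                entries.put(parts[0].trim(), parts[1].trim());
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        String propValue = "\n  [1:Some value] \n  [14:Some other value] \n  [834:Yet another value] \n";
        parse(propValue).forEach((k, v) ->
                System.out.println(String.format("Key[%s] Value[%s]", k, v)));
    }
}
```

Unlike the regex versions, this one does not check that the key is numeric; that check would have to be added separately if it matters.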


Since you are matching against a string that spans several lines, you should tell Pattern so:

final Pattern p = Pattern.compile(REGEX_PATTERN, Pattern.MULTILINE);

Although it is not directly relevant to your case, I'd recommend adding DOTALL too:

final Pattern p = Pattern.compile(REGEX_PATTERN, Pattern.MULTILINE | Pattern.DOTALL);

1 Comment

The multiline thing is only important if you want to detect start/end of line. Probably a good idea to do so, though.
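The point in that comment can be checked with a small experiment. In the sketch below, the class name `MultilineDemo`, the `countMatches` helper, and the sample input are made up for illustration. The flag makes no difference to a pattern without anchors, and only matters once ^ and $ are added.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineDemo {
    // Counts how many times the pattern is found in the input with the given flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "[1:Some value]\n[14:Some other value]\n[834:Yet another value]";
        String unanchored = "\\[(\\d+):([\\w\\s]+)\\]";
        String anchored = "^\\[(\\d+):([\\w\\s]+)\\]$";

        // No anchors: the flag changes nothing.
        System.out.println(countMatches(unanchored, 0, input));                 // prints 3
        System.out.println(countMatches(unanchored, Pattern.MULTILINE, input)); // prints 3

        // With anchors: ^ and $ match at line boundaries only under MULTILINE.
        System.out.println(countMatches(anchored, 0, input));                   // prints 0
        System.out.println(countMatches(anchored, Pattern.MULTILINE, input));   // prints 3
    }
}
```

DOTALL similarly only changes what the `.` metacharacter matches, so it has no effect on a pattern that contains no `.`.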
