0

I want to write a function that extracts a variable number of values from a String according to a regex pattern.

Here is my function:

/**
 * Get substrings in a string using groups in regular expression.
 * 
 * @param str
 * @param regex
 * @return
 */
public static String[] regexMatch(String str, String regex) {
    String[] rtn = null;
    if (str != null && regex != null) {
        Pattern pat = Pattern.compile(regex);
        Matcher matcher = pat.matcher(str);
        if (matcher.find()) {
            int nGroup = matcher.groupCount();
            rtn = new String[nGroup];
            for (int i = 0; i < nGroup; i++) {
                rtn[i] = matcher.group(i);
            }
        }
    }
    return rtn;
}

When I test it using:

String str = "nets-(90000,5,4).dat";
String regex = "(\\d+),(\\d+),(\\d+)";
String[] rtn = regexMatch(str, regex);

I get:

rtn: [90000,5,4,90000,5]

How can I get rtn to be [90000,5,4] as I expected?

1
  • You could restructure your solution a little by removing everything except commas and numbers and then split on commas. However, this depends on the other strings you process having a similar structure. Commented May 15, 2014 at 22:10
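
A minimal sketch of the approach this comment suggests, assuming the only digits and commas in the input belong to the values you want. The class name SplitOnCommas and the method name extractNumbers are made up for illustration:

```java
import java.util.Arrays;

class SplitOnCommas {
    // Hypothetical helper: keeps only digits and commas, then splits on commas.
    // Assumes the input has no other digits or commas (e.g. no "v2" in the file name).
    static String[] extractNumbers(String str) {
        String digitsAndCommas = str.replaceAll("[^\\d,]", "");
        return digitsAndCommas.split(",");
    }

    public static void main(String[] args) {
        String[] rtn = extractNumbers("nets-(90000,5,4).dat");
        System.out.println(Arrays.toString(rtn)); // prints [90000, 5, 4]
    }
}
```

Unlike the regex-group version, this does not check that the input has the expected shape, so it is only safe when all inputs are structured alike.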

2 Answers

1

Your array currently stores:

[0] -> 90000,5,4
[1] -> 90000
[2] -> 5

That is why the output is [90000,5,4,90000,5]: group(0) represents the entire match, so it returns 90000,5,4.

What you need are the matches from groups 1, 2 and 3.

(\\d+),(\\d+),(\\d+)
   1      2      3

So change

rtn[i] = matcher.group(i);

to

rtn[i] = matcher.group(i+1);
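
Putting the change together, the whole method could look like the sketch below. The wrapper class name RegexGroups and the main method are only there to make the example runnable; the early returns are a stylistic rearrangement of the original null checks, not a change in behavior:

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexGroups {
    /**
     * Returns capture groups 1..groupCount() of the first match,
     * or null if an argument is null or the pattern does not match.
     */
    static String[] regexMatch(String str, String regex) {
        if (str == null || regex == null) {
            return null;
        }
        Matcher matcher = Pattern.compile(regex).matcher(str);
        if (!matcher.find()) {
            return null;
        }
        // groupCount() counts only the capturing groups; group 0 is not included.
        int nGroup = matcher.groupCount();
        String[] rtn = new String[nGroup];
        for (int i = 0; i < nGroup; i++) {
            // Array index i holds capture group i + 1, skipping the full match.
            rtn[i] = matcher.group(i + 1);
        }
        return rtn;
    }

    public static void main(String[] args) {
        String[] rtn = regexMatch("nets-(90000,5,4).dat", "(\\d+),(\\d+),(\\d+)");
        System.out.println(Arrays.toString(rtn)); // prints [90000, 5, 4]
    }
}
```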

1

I would start the for loop at 1 so that you get the groups declared in your regex. The loop should look like this:

for (int i = 1; i <= nGroup; i++) {
    rtn[i] = matcher.group(i);
}

Group 0 is always the entire string matched by your regex. The groups come from:

String regex = "(\\d+),(\\d+),(\\d+)";

matcher.group(1), matcher.group(2), and matcher.group(3) will then give you what you want.

4 Comments

If the OP starts the loop at 1, the returned array will never fill [0] and the loop will try to fill [length], unless you write it as rtn[i-1] = matcher.group(i); which is a variation of my answer.
This would produce an IndexOutOfBoundsException.
I feel I should loop from 0 to nGroup-1.
True, matcher.groupCount() doesn't include group 0 in its count.
