
I essentially have a String representation of a comma-separated list. However, each individual element is itself comma-separated, so the String was modified to surround each element with '<' and '>'. I am trying to use a regex to capture each element and add it to a list, producing a List of elements rather than a String of a List.

Here are some example String inputs:

"<>"         // should match regex, but will be thrown out
"<a=1>"
"<a=1,b=1>"
"<a=1,b=1>,<a=2,b=2>"
"<a=1,b=1>,<a=2,b=2>,<a=3,b=3,c=3>,<a=4>"

The corresponding outputs I would like would be lists like so:

["a=1"]
["a=1,b=1"]
["a=1,b=1","a=2,b=2"]
["a=1,b=1","a=2,b=2","a=3,b=3,c=3","a=4"]

The pattern I am trying to use is:

Pattern pattern = Pattern.compile("<([^>]*)>(,<([^>]*)>)*");

But when I try to create the list, it does not treat each additional occurrence as a new group.

Matcher matcher = pattern.matcher(myString);
if (matcher.matches()) {
    List<String> listOfElements = new ArrayList<>();
    for (int i = 1; i <= matcher.groupCount(); i++) { // group 0 represents the entire String, so start at index 1
        if (matcher.group(i) != null) {
            listOfElements.add(matcher.group(i));
        }
    }
    System.out.println(listOfElements);
}

The results for the above test cases are:

["a=1"]
["a=1,b=1"]
["a=1,b=1", ",<a=2,b=2>", "a=2,b=2"]
["a=1,b=1", ",<a=4>", "a=4"]

Note: I added the quotes to that result for readability to separate out the values in the list - obviously the System.out.println() does not write out the quotes.

What is the proper regex to do this? Or if there is a better way than using regex, I'd be happy to hear it, though keep in mind that I'd prefer not to use a third-party package.

  • Can you maybe just drop the initial < and the final >, then split the rest on >,< ? That'll give you what you want in an array, and you can then use Arrays.asList. Commented Jun 5, 2018 at 23:39
  • @DawoodibnKareem ha, well that does indeed work, but seems like too much of a hack Commented Jun 5, 2018 at 23:54
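The strip-and-split idea from the comment above can be sketched as follows. This is only an illustration: the class and method names (StripAndSplit, parse) are hypothetical, and the code assumes the input is either empty or well-formed, meaning it starts with '<', ends with '>', and uses ">,<" between elements.

```java
import java.util.ArrayList;
import java.util.List;

class StripAndSplit {
    // Hypothetical helper; assumes well-formed input such as "<a=1,b=1>,<a=2,b=2>".
    static List<String> parse(String input) {
        List<String> elements = new ArrayList<>();
        if (input.length() < 2) {
            return elements; // nothing to strip
        }
        // Drop the leading '<' and the trailing '>'.
        String inner = input.substring(1, input.length() - 1);
        // ">,<" contains no regex metacharacters, so it is matched literally.
        // The limit -1 keeps trailing empty strings so they can be filtered explicitly.
        for (String part : inner.split(">,<", -1)) {
            if (!part.isEmpty()) { // throw out empty elements such as "<>"
                elements.add(part);
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        System.out.println(parse("<a=1,b=1>,<a=2,b=2>,<a=3,b=3,c=3>,<a=4>"));
    }
}
```

Unlike a pattern used with matches(), this sketch does not validate the input; malformed strings would need a separate check.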

2 Answers


Match the entries one by one with find instead of matches.

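// Matches a single element; group 1 captures the text between '<' and '>'.
Pattern pattern = Pattern.compile("<([^>]*)>");
Matcher matcher = pattern.matcher(myString);
List<String> listOfElements = new ArrayList<>();

// Each call to find() advances to the next element in the input.
while (matcher.find()) {
    listOfElements.add(matcher.group(1));
}
System.out.println(listOfElements);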
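Why this helps: in Java, the number of groups is fixed by the pattern, and a capturing group inside a repeated section keeps only its last match, which is why the matches() attempt in the question returns just the final element for the repeated part. Below is a self-contained sketch of the find() loop run against the question's sample inputs. The class and method names (ElementExtractor, extract) are made up for illustration, and the empty-element filter is an assumption based on the question's note that "<>" should be thrown out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ElementExtractor {
    // One bracketed element; group 1 is the text between '<' and '>'.
    private static final Pattern ELEMENT = Pattern.compile("<([^>]*)>");

    // Hypothetical helper name, not part of the original answer.
    static List<String> extract(String input) {
        List<String> elements = new ArrayList<>();
        Matcher matcher = ELEMENT.matcher(input);
        while (matcher.find()) {
            String element = matcher.group(1);
            if (!element.isEmpty()) { // skip empty elements such as "<>"
                elements.add(element);
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "<>",
            "<a=1>",
            "<a=1,b=1>",
            "<a=1,b=1>,<a=2,b=2>",
            "<a=1,b=1>,<a=2,b=2>,<a=3,b=3,c=3>,<a=4>"
        };
        for (String input : inputs) {
            System.out.println(extract(input));
        }
    }
}
```

Note that find() simply skips over anything that does not look like an element, so this loop does not check that the whole String is well-formed the way matches() would.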

2 Comments

This is definitely close to what I'm looking for, but the return values are keeping the brackets for some reason (maybe since it's just group()?). I could technically just replace the brackets with nothing, but there's probably a better way
@Sean Made a typo. Try it again.

You can do it in one line by splitting using look arounds:

String[] parts = str.split("(?<=>),(?=<)");

The regex splits on commas that are preceded by > and followed by <, without consuming the angle brackets, so each part keeps its surrounding < and >.

If you really need a List:

List<String> parts = Arrays.asList(str.split("(?<=>),(?=<)"));
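To get the bracket-free elements shown in the question, the parts can be trimmed after the split. The following is a rough sketch of that follow-up step; the class and method names (LookaroundSplit, parse) are hypothetical, and dropping empty elements is an assumption taken from the question's remark about "<>".

```java
import java.util.ArrayList;
import java.util.List;

class LookaroundSplit {
    // Hypothetical helper; splits on the separating commas, then strips the brackets.
    static List<String> parse(String str) {
        List<String> elements = new ArrayList<>();
        // Split only on commas that sit between '>' and '<'.
        for (String part : str.split("(?<=>),(?=<)")) {
            // Each part is expected to look like "<a=1,b=1>".
            // The length check drops "<>" and any empty part.
            if (part.length() > 2 && part.startsWith("<") && part.endsWith(">")) {
                elements.add(part.substring(1, part.length() - 1));
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        System.out.println(parse("<a=1,b=1>,<a=2,b=2>,<a=3,b=3,c=3>,<a=4>"));
    }
}
```

Parts that do not start with '<' and end with '>' are silently ignored here; stricter handling, such as throwing an exception, may be preferable depending on how trusted the input is.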

1 Comment

sure enough, that works. I like the regex that Leo gave a little better, but it's still good to know about this way! thanks
