I have a Java string in the following format:

String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";

Using the Matcher and Pattern classes from the java.util.regex package, I have to get the output string in the following format:

Output: [NYK:1100][CLT:2300][KTY:3540]

Can you suggest a RegEx pattern which can help me get the above output format?

    Have you tried something already? Commented Jul 18, 2017 at 19:41

2 Answers

You can use the regex \[name:([A-Z]+)\]\[distance:(\d+)\] with Pattern and Matcher like this:

String regex = "\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(s);

StringBuilder result = new StringBuilder();
while (matcher.find()) {                                                
    result.append("[");
    result.append(matcher.group(1));
    result.append(":");
    result.append(matcher.group(2));
    result.append("]");
}

System.out.println(result.toString());

Output

[NYK:1100][CLT:2300][KTY:3540]
  • \[name:([A-Z]+)\]\[distance:(\d+)\] captures two groups: group 1 is the run of uppercase letters after [name: and group 2 is the run of digits after [distance:
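
For reference, here is a minimal self-contained sketch of the same approach, including the imports the snippet above leaves out. The class name CityDistanceExtractor and the method name extract are hypothetical, chosen only for illustration; the pattern is the one shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; only the regex comes from the answer.
class CityDistanceExtractor {
    // Group 1: uppercase name, group 2: digits of the distance.
    private static final Pattern PAIR =
            Pattern.compile("\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]");

    // Rebuilds every [name:X][distance:N] pair as [X:N], in order of appearance.
    static String extract(String input) {
        Matcher matcher = PAIR.matcher(input);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            result.append('[')
                  .append(matcher.group(1))
                  .append(':')
                  .append(matcher.group(2))
                  .append(']');
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] "
                 + "[name:KTY][distance:3540] Price:";
        System.out.println(extract(s)); // [NYK:1100][CLT:2300][KTY:3540]
    }
}
```

Because find() walks the input match by match, this version does not depend on how many pairs the string contains; an input with no pairs yields an empty string.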

Another solution, suggested by @tradeJmark, is to use named capture groups:

String regex = "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]";

This lets you read each group's value by its name instead of its index:

while (matcher.find()) {                                                
    result.append("[");
    result.append(matcher.group("name"));
    //----------------------------^^
    result.append(":");
    result.append(matcher.group("distance"));
    //------------------------------^^
    result.append("]");
}
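
As one possible use of the named groups, the sketch below collects the pairs into a map instead of a string. The class name NamedGroupExample and the method name parse are hypothetical, and storing distances as Integer is an assumption that the digit runs fit in an int.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration; the regex is the named-group one above.
class NamedGroupExample {
    private static final Pattern PAIR = Pattern.compile(
            "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]");

    // Returns name -> distance, keeping the order in which pairs appear.
    static Map<String, Integer> parse(String input) {
        Map<String, Integer> distances = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            // Assumption: the distance fits in an int; parseInt throws otherwise.
            distances.put(matcher.group("name"),
                          Integer.parseInt(matcher.group("distance")));
        }
        return distances;
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] "
                 + "[name:KTY][distance:3540] Price:";
        System.out.println(parse(s)); // {NYK=1100, CLT=2300, KTY=3540}
    }
}
```

Named groups require Java 7 or later. A repeated name in the input would overwrite the earlier entry in this map-based variant.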
3 Comments

Pretty much exactly how I would have done it. My only addition (and this is somewhat a matter of preference) is that the capture groups could be named, e.g. name:(?<name>[A-Z]+), and then accessed as matcher.group("name"). Just makes it really clear at the access point exactly what segment you're trying to access.
This is a good way, @tradeJmark, to use name:(?<name>[A-Z]+) and then access it as matcher.group("name"). Honestly, it is the first time I have seen it. I already tested it and it works very well: String regex = "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]"; I will add it to my answer if you don't mind :)
sure, definitely add it, if you like it.
If the format of the string is fixed and you always have exactly 3 [name:...][distance:...] pairs to deal with, you may define a block pattern that matches one pair and captures its 2 parts into separate groups, then use fairly simple code with .replaceAll:

String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
String matchingBlock = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";
String res = s.replaceAll(String.format(".*%1$s%1$s%1$s.*", matchingBlock), 
    "[$1:$2][$3:$4][$5:$6]");
System.out.println(res); // [NYK:1100][CLT:2300][KTY:3540]

The block pattern matches:

  • \\s* - 0+ whitespaces
  • \\[name: - a literal [name: substring
  • ([A-Z]+) - Group n capturing 1 or more uppercase ASCII chars (\\w+ can also be used)
  • ]\\[distance: - a literal ][distance: substring
  • (\\d+) - Group m capturing 1 or more digits
  • ] - a ] symbol.

In the .*%1$s%1$s%1$s.* pattern, the block is repeated three times, so the capture groups are numbered 1 to 6 (referenced as $1 through $6 in the replacement pattern). The leading and trailing .* consume the start and end of the string so that they are removed by the replacement (add (?s) at the start of the pattern if the string can contain line breaks).
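
A minimal sketch of the (?s) variant mentioned above, wrapped in a hypothetical class FixedFormatReplacer with a hypothetical method reformat. It assumes the input always contains exactly three consecutive pairs; if it does not, the pattern fails to match and replaceAll returns the input unchanged.

```java
// Hypothetical class for illustration; the block pattern is the one above.
class FixedFormatReplacer {
    private static final String BLOCK =
            "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";

    // (?s) lets . match line breaks, so the leading and trailing .*
    // can swallow text that spans several lines.
    private static final String REGEX =
            String.format("(?s).*%1$s%1$s%1$s.*", BLOCK);

    static String reformat(String input) {
        // Groups 1-6 are the name/distance captures of the three blocks.
        return input.replaceAll(REGEX, "[$1:$2][$3:$4][$5:$6]");
    }

    public static void main(String[] args) {
        String multiLine = "City:\n[name:NYK][distance:1100] [name:CLT][distance:2300]\n"
                         + "[name:KTY][distance:3540]\nPrice:";
        System.out.println(reformat(multiLine)); // [NYK:1100][CLT:2300][KTY:3540]
    }
}
```

The line breaks between blocks are matched by \\s*, which accepts any whitespace; the (?s) flag only matters for the text before the first block and after the last one.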
