I am exploring regular expressions.

Problem statement: replace each string between # and # with the corresponding value from the replacements map.

import java.util.regex.*;
import java.util.*;

public class RegExTest {
    public static void main(String[] args) {

        // Keys are the names found between the # markers; values are what to substitute.
        Map<String, String> replacements = new HashMap<String, String>();
        replacements.put("OldString1", "NewString1");
        replacements.put("OldString2", "NewString2");
        replacements.put("OldString3", "NewString3");

        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";

        // The parentheses define capture group 1: the text between the two hashes.
        Pattern pattern = Pattern.compile("\\#(.+?)\\#");
        //Pattern pattern = Pattern.compile("\\#\\#");
        Matcher matcher = pattern.matcher(source);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            // Copy the text before this match and drop the match itself...
            matcher.appendReplacement(buffer, "");
            // ...then append the value looked up by the captured key.
            buffer.append(replacements.get(matcher.group(1)));
        }
        // Copy whatever follows the last match.
        matcher.appendTail(buffer);
        System.out.println("OLD_String:" + source);
        System.out.println("NEW_String:" + buffer.toString());
    }
}

Output (this meets my requirement, but I do not understand how the group(1) call works):

OLD_String:#OldString1##OldString2#_ABCDEF_#OldString3#
NEW_String:NewString1NewString2_ABCDEF_NewString3

If I replace

Pattern pattern = Pattern.compile("\\#(.+?)\\#");

with

Pattern pattern = Pattern.compile("\\#\\#");

I get the following error:

Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 1

I do not understand the difference between

"\\#(.+?)\\#" and "\\#\\#"

Can you explain the difference?

2 Answers

2

The difference is fairly straightforward: \\#(.+?)\\# will match two hashes with one or more characters between them, while \\#\\# will match two hashes directly next to each other.
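To see this concretely, here is a minimal sketch that lists what each pattern matches in the question's source string. The class name PatternComparison and the helper findAll are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternComparison {
    // Hypothetical helper: returns every substring of input matched by regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group()); // group() with no argument is the whole match
        }
        return matches;
    }

    public static void main(String[] args) {
        String source = "#OldString1##OldString2#_ABCDEF_#OldString3#";
        // Hashes with at least one character between them.
        System.out.println(findAll("\\#(.+?)\\#", source));
        // Two hashes with nothing between them.
        System.out.println(findAll("\\#\\#", source));
    }
}
```

With the question's input, the first line printed should be [#OldString1#, #OldString2#, #OldString3#] and the second [##], since the only adjacent pair of hashes is the one between the first two tokens.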

A more interesting question, to my mind, is "what is the difference between \\#(.+?)\\# and \\#.+?\\# ?"

Here, what differs is what is (or isn't) getting captured. Parentheses in a regex indicate a capture group: basically, some substring you want to retrieve separately from the overall matched string. In this case, you're capturing the text in between the hashes. The first pattern will capture it and make it available separately, while the second will not. Try it yourself: asking for matcher.group(1) on the first will return that text, while on the second it will throw an exception, even though they both match the same text.
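A small sketch of that experiment, in case it helps. GroupDemo, groupCount and firstCaptured are illustrative names of my own; the regex calls used (Matcher.find, Matcher.group(int), Matcher.groupCount) are standard java.util.regex methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Number of capture groups the pattern declares (group 0, the whole match, is not counted).
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Returns group 1 of the first match, or a note when there is no match or no group 1.
    static String firstCaptured(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            return "no match";
        }
        try {
            return matcher.group(1);
        } catch (IndexOutOfBoundsException e) {
            // Thrown when the pattern declares no group with this index.
            return "no group 1";
        }
    }

    public static void main(String[] args) {
        String input = "#OldString1#";
        System.out.println(groupCount("\\#(.+?)\\#") + " " + firstCaptured("\\#(.+?)\\#", input));
        System.out.println(groupCount("\\#.+?\\#") + " " + firstCaptured("\\#.+?\\#", input));
    }
}
```

The first line should print 1 OldString1 and the second 0 no group 1: both patterns match the same text, but only the one with parentheses has a group 1 to ask for.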

0

.+? tells it to match one or more of any character lazily, that is, as few characters as possible before the next #. So it stops at the first closing # it can use instead of running on to the last one.
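A rough sketch of lazy versus greedy matching, assuming a short made-up input "#a##b#". LazyVsGreedy and firstCaptured are hypothetical names; the greedy pattern \\#(.+)\\# is not from the question and is only here for contrast.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedy {
    // Returns capture group 1 of the first match, or null if the pattern does not match.
    static String firstCaptured(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "#a##b#";
        // Lazy: takes as few characters as possible, so it stops at the first closing #.
        System.out.println(firstCaptured("\\#(.+?)\\#", input));
        // Greedy: takes as many characters as possible, so it runs to the last #.
        System.out.println(firstCaptured("\\#(.+)\\#", input));
    }
}
```

This should print a for the lazy pattern and a##b for the greedy one, which is why the lazy form is the one that works for replacing each #...# token separately.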

I think \\#\\# would match ##, so I think the error occurs because that pattern has no parentheses: there is only group 0 (the whole match) and no group 1. But I'm not 100% sure on that part.
