Is there a simple solution to parse a String by using regex in Java?

I have to adapt an HTML page, so I need to rewrite several strings, e.g.:

href="/browse/PJBUGS-911"
=>
href="PJBUGS-911.html"

The strings differ only in the ID (e.g. 911). My first idea looks like this:

String input = "";
String output = input.replaceAll("href=\"/browse/PJBUGS\\-[0-9]*\"", "href=\"PJBUGS-???.html\"");

I want to replace everything except the ID. How can I do this?

Would be nice if someone can help me :)

3 Answers

You can capture the substrings matched by parts of your pattern by wrapping those parts in parentheses. You can then refer to each captured group in the replacement string as $n, where n is the number of the group (counting opening parentheses from left to right). For your example:

String output = input.replaceAll("href=\"/browse/PJBUGS-([0-9]*)\"", "href=\"PJBUGS-$1.html\"");

Or if you want:

String output = input.replaceAll("href=\"/browse/(PJBUGS-[0-9]*)\"", "href=\"$1.html\"");
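To see the capture group at work on more than one link, here is a minimal runnable sketch. The class name HrefRewriteExample, the method rewriteLinks and the sample HTML are made up for illustration, and it uses [0-9]+ instead of [0-9]* so that a key without any digits is left untouched (an assumption about what you want):

```java
class HrefRewriteExample {
    // Rewrites every href="/browse/PJBUGS-<digits>" to href="PJBUGS-<digits>.html".
    // Group 1 captures the whole issue key, which $1 reinserts in the replacement.
    static String rewriteLinks(String html) {
        return html.replaceAll("href=\"/browse/(PJBUGS-[0-9]+)\"", "href=\"$1.html\"");
    }

    public static void main(String[] args) {
        // Hypothetical sample input containing two links
        String html = "<a href=\"/browse/PJBUGS-911\">one</a> <a href=\"/browse/PJBUGS-12\">two</a>";
        System.out.println(rewriteLinks(html));
    }
}
```

Because replaceAll replaces every match, this also works when input is a whole HTML document rather than a single attribute.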

1 Comment

Thank you for your very quick answer and solution. Works fine :-)

This does not use a regex, but it may still solve your problem:

output = "href=\"" + input.substring(input.lastIndexOf("/") + 1, input.lastIndexOf("\"")) + ".html\"";

5 Comments

Don't forget to add ".html" to the end
Pretty simple and straightforward this is.
@Vulcan Yes there is. He requires it for his answer.
I believe input is not a single href="/browse/..." but a whole HTML file. Hence, the explicit mentioning of replaceAll in the question.
Thanks for the edit. And yes you're probably right @m.buettner

This is how I would do it:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefDemo
{
    public static void main(String[] args)
    {
        String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\" " +
                "blahblah href=\"/browse/PJBUGS-34234\"";

        // Group 1 captures the issue key, e.g. PJBUGS-911
        Pattern ptrn = Pattern.compile("href=\"/browse/(PJBUGS-[0-9]+)\"");

        Matcher mtchr = ptrn.matcher(text);

        while (mtchr.find())
        {
            String match = mtchr.group(0);     // the whole matched href attribute
            String insMatch = mtchr.group(1);  // only the captured issue key

            // Build the replacement directly from the captured key
            String repl = "href=\"" + insMatch + ".html\"";

            System.out.println("orig = <" + match + "> repl = <" + repl + ">");
        }
    }
}

This just shows the regex and replacements, not the final formatted text, which you can get by using Matcher.replaceAll:

String allRepl = mtchr.replaceAll("href=\"$1.html\"");

If you are only interested in replacing every match, you don't need the loop. I used it only for debugging and to show how the regex works.
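For completeness, a loop-free version might look like the sketch below. The class name CompiledHrefRewriter, the constant LINK and the method rewrite are hypothetical names chosen for this example. Compiling the pattern once is only worthwhile if you process many strings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompiledHrefRewriter {
    // Compiled once and reused; group 1 captures the issue key (e.g. PJBUGS-911).
    private static final Pattern LINK =
            Pattern.compile("href=\"/browse/(PJBUGS-[0-9]+)\"");

    // Returns a copy of html with every matching href rewritten to "<key>.html".
    static String rewrite(String html) {
        Matcher m = LINK.matcher(html);
        return m.replaceAll("href=\"$1.html\"");
    }

    public static void main(String[] args) {
        String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\"";
        System.out.println(rewrite(text));
    }
}
```

Matcher.replaceAll resets the matcher before scanning, so it gives the same result whether or not find() was called on that matcher beforehand.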
