2

I have a long text in Java that contains at least one Markdown image tag. If there are N Markdown image tags, I need to split the string into N+1 substrings and store them in a String array called texts. For example, given the following text:

Hello world!
![Alt text](/1/2/3.jpg)
Hello Stack Overflow!

Then Hello world!\n will be stored at position 0 and \nHello Stack Overflow! at position 1. For my question, we can assume that:

  • The alt text part contains only the characters A-Z, a-z and the space character.
  • The URL part contains only the digits 0-9 and the slash /. Its extension is always .jpg; no other extension occurs.

My question is how to split the text. Do we need a Java regular expression, such as *![*](*.jpg)?

2
  • A regex, sure - why not. Is your regex notation different than the standard one? Commented Apr 3, 2016 at 22:27
  • No, my regex notation is supposed to be the same as the standard one. If there's an error, it's my fault. (I don't know much about regular expressions.) Commented Apr 3, 2016 at 22:29

3 Answers

11

Try this (ready to copy-paste):

"!\\[[^\\]]+\\]\\([^)]+\\)"

See here for info about how to get the matches.

"Untainted" version (without the Java string escaping): !\[[^\]]+\]\([^)]+\)

Explanation

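  • ! matches a literal !
  • \[ matches a literal [ (escaped)
  • [^\]]+ matches one or more characters that are not ], as many as possible
  • \]\( matches a literal ]( (both escaped)
  • [^)]+ matches one or more characters that are not ), as many as possible
  • \) matches a literal ) (escaped)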
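
As a rough sketch of how the pattern above could be used for the split in the question: Pattern.split returns the text around the matches, and a Matcher loop collects the image tags themselves. The class and method names here (MarkdownImageSplitter, splitAroundImages, findImages) are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
public class MarkdownImageSplitter {

    // The pattern from this answer: "![", alt text, "](", URL, ")".
    private static final Pattern IMAGE = Pattern.compile("!\\[[^\\]]+\\]\\([^)]+\\)");

    /** Returns the pieces of text surrounding the image tags. */
    public static String[] splitAroundImages(String text) {
        // A negative limit keeps trailing empty strings, so N images give N+1 pieces.
        return IMAGE.split(text, -1);
    }

    /** Returns the image tags themselves, in order of appearance. */
    public static List<String> findImages(String text) {
        List<String> images = new ArrayList<>();
        Matcher m = IMAGE.matcher(text);
        while (m.find()) {
            images.add(m.group());
        }
        return images;
    }

    public static void main(String[] args) {
        String text = "Hello world!\n![Alt text](/1/2/3.jpg)\nHello Stack Overflow!";
        System.out.println(Arrays.toString(splitAroundImages(text)));
        System.out.println(findImages(text));
    }
}
```

The negative limit matters when an image is the last thing in the text: with the default split, the trailing empty string would be dropped and the array would have only N entries.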
3 Comments

@MincongHuang Added! I explained the "untainted" version.
Wonderful explanation. I've learnt a lot from it, thank you @Laurel
Actually, the matched content (the Markdown image tags) is useful to me. Can I get those and put them into another String array?
0

This is my way; it extracts the URL from each image tag:

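import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {

    public static void main(String[] args) {
        List<String> allMatches = new ArrayList<>();
        String str = "}```![imageName](/sword?SwordControllerName=KMFileDownloadController&id=c60b6c5a8d9b46baa1dc266910db462d \"imageName\")#### JSON data";
        // Negated character classes keep each match inside one tag;
        // a greedy .* could span several tags on the same line.
        Matcher m = Pattern.compile("\\[[^\\]]*\\]\\(([^)]*)\\)").matcher(str);
        while (m.find()) {
            // Group 1 is everything inside the parentheses; drop the optional title after the URL.
            allMatches.add(m.group(1).split(" ")[0]);
        }
        // Prints: /sword?SwordControllerName=KMFileDownloadController&id=c60b6c5a8d9b46baa1dc266910db462d
        for (String s : allMatches) {
            System.out.println(s);
        }
    }
}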

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

0
!\[[^\]]*?\]\([^)]+\)

This way the alt text can be empty, though an empty alt text makes little sense.
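
To illustrate the difference, here is a small sketch comparing the + version from the accepted answer with the *? version on a tag whose alt text is empty. The class and method names (EmptyAltTextDemo, matchesWithAltRequired, matchesWithEmptyAltAllowed) are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
public class EmptyAltTextDemo {

    // Requires at least one character of alt text.
    private static final Pattern REQUIRES_ALT = Pattern.compile("!\\[[^\\]]+\\]\\([^)]+\\)");

    // Allows zero characters of alt text.
    private static final Pattern ALLOWS_EMPTY_ALT = Pattern.compile("!\\[[^\\]]*?\\]\\([^)]+\\)");

    public static boolean matchesWithAltRequired(String s) {
        return REQUIRES_ALT.matcher(s).find();
    }

    public static boolean matchesWithEmptyAltAllowed(String s) {
        return ALLOWS_EMPTY_ALT.matcher(s).find();
    }

    public static void main(String[] args) {
        String emptyAlt = "![](/1/2/3.jpg)";
        System.out.println(matchesWithAltRequired(emptyAlt));
        System.out.println(matchesWithEmptyAltAllowed(emptyAlt));
    }
}
```

Only the second pattern finds the tag with the empty alt text; both find a tag such as ![Alt text](/1/2/3.jpg).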
