2

I have a long text in Java that contains at least one Markdown image. If there are N Markdown images, I need to split the string into N+1 substrings and store them in a String array called texts. For example, given the following text:

Hello world!
![Alt text](/1/2/3.jpg)
Hello Stack Overflow!

Then Hello world!\n will be stored at position 0 and \nHello Stack Overflow! at position 1. For my question, we can assume that:

  • The Alt text part contains only the characters A-Z, a-z, and the space character.
  • The URL part contains only the digits 0-9 and the slash /. Its extension is always .jpg; no other extension occurs.

My question is: how do I split the text? Do I need a Java regular expression, such as *![*](*.jpg)?

2
  • A regex, sure - why not. Is your regex notation different than the standard one? Commented Apr 3, 2016 at 22:27
  • No, my regex notation is supposed to be the same as the standard one. If there's an error, it's my fault. (I don't know much about regular expressions.) Commented Apr 3, 2016 at 22:29

3 Answers

11

Try this (ready to copy-paste):

"!\\[[^\\]]+\\]\\([^)]+\\)"

See here for info about how to get the matches.

"Untainted" version: !\[[^\]]+\]\([^)]+\)

Explanation

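  • ! matches a literal !
  • \[ matches a literal [ (escaped)
  • [^\]]+ matches one or more characters that are not ], as many as possible
  • \]\( matches a literal ]( (both escaped)
  • [^)]+ matches one or more characters that are not ), as many as possible
  • \) matches a literal ) (escaped)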
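
To connect this pattern back to the question's N+1 substrings, here is a minimal sketch that passes it to `Pattern.split`. The class name `MarkdownImageSplit` and the helper `splitAroundImages` are made up for illustration. The sample text assumes the image sits on its own line, as the question describes.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class MarkdownImageSplit {
    // The pattern from this answer: ![alt](url)
    private static final Pattern IMAGE =
            Pattern.compile("!\\[[^\\]]+\\]\\([^)]+\\)");

    // Splits the text around every image. The limit of -1 keeps trailing
    // empty strings, so N images always produce N+1 pieces.
    static String[] splitAroundImages(String text) {
        return IMAGE.split(text, -1);
    }

    public static void main(String[] args) {
        String text = "Hello world!\n![Alt text](/1/2/3.jpg)\nHello Stack Overflow!";
        String[] texts = splitAroundImages(text);
        System.out.println("pieces: " + texts.length);
        for (String piece : texts) {
            // Brackets make the kept newlines visible
            System.out.println("[" + piece + "]");
        }
    }
}
```

Without the `-1` limit, `split` drops trailing empty strings, so a text that ends with an image would yield fewer than N+1 pieces.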

3 Comments

@MincongHuang Added! I explained the "untainted" version.
Wonderful explanation. I've learnt a lot from it, thank you @Laurel
Actually, the matched content (the Markdown images themselves) is useful to me too. Can I get those and put them into another string array?
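
One possible way to collect the images as well is to walk the text with a `Matcher` instead of splitting. This is only a sketch: the class `MarkdownImageExtract` and its methods are hypothetical names, and the capturing groups around the alt text and URL are an addition to the answer's pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class MarkdownImageExtract {
    // Same pattern as the answer, with capturing groups added:
    // group 1 = alt text, group 2 = URL
    private static final Pattern IMAGE =
            Pattern.compile("!\\[([^\\]]+)\\]\\(([^)]+)\\)");

    // Returns every complete ![alt](url) occurrence, in order.
    static List<String> findImages(String text) {
        List<String> images = new ArrayList<>();
        Matcher m = IMAGE.matcher(text);
        while (m.find()) {
            images.add(m.group()); // the whole match
        }
        return images;
    }

    // Returns only the URL part of every image, in order.
    static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = IMAGE.matcher(text);
        while (m.find()) {
            urls.add(m.group(2));
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "Hello world!\n![Alt text](/1/2/3.jpg)\nHello Stack Overflow!";
        System.out.println(findImages(text)); // prints [![Alt text](/1/2/3.jpg)]
        System.out.println(findUrls(text));   // prints [/1/2/3.jpg]
    }
}
```

`findImages(text).toArray(new String[0])` converts the result to a String array if one is needed.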
0

This is my approach:

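import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        List<String> allMatches = new ArrayList<String>();
        String str = "}```![imageName](/sword?SwordControllerName=KMFileDownloadController&id=c60b6c5a8d9b46baa1dc266910db462d \"imageName\")#### JSON data";
        // Lazy quantifiers (.*?) keep each match inside a single image;
        // a greedy .* would run from the first [ to the last ) in the text
        Matcher m = Pattern.compile("\\[.*?\\]\\((.*?)\\)").matcher(str);
        while (m.find()) {
            // Group 1 holds the URL followed by the optional title; keep only the URL
            allMatches.add(m.group(1).split(" ")[0]);
        }
        // Prints "/sword?SwordControllerName=KMFileDownloadController&id=c60b6c5a8d9b46baa1dc266910db462d"
        for (String s : allMatches) {
            System.out.println(s);
        }
    }
}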

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

0
!\[[^\]]*?\]\([^)]+\) 

That way the alt text can be empty, though an empty alt text makes little sense.
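
A small sketch of the difference, assuming an image written with empty brackets. The class name `EmptyAltText` and the helper `containsImage` are hypothetical. The first pattern is the `+` form from the accepted answer and the second is the `*?` form from this answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class EmptyAltText {
    // Requires at least one character of alt text
    static final Pattern REQUIRES_ALT =
            Pattern.compile("!\\[[^\\]]+\\]\\([^)]+\\)");
    // Accepts zero or more characters of alt text
    static final Pattern ALLOWS_EMPTY_ALT =
            Pattern.compile("!\\[[^\\]]*?\\]\\([^)]+\\)");

    // True if the pattern matches anywhere in the text.
    static boolean containsImage(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        String emptyAlt = "![](/1/2/3.jpg)";
        System.out.println(containsImage(REQUIRES_ALT, emptyAlt));     // prints false
        System.out.println(containsImage(ALLOWS_EMPTY_ALT, emptyAlt)); // prints true
    }
}
```

Since the character class excludes `]` and the next token must be `]`, the lazy `*?` matches the same text here as a plain `*` would.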
