I am trying to extract both XML tags and the text within them using regex. I understand that regex is not the best option, but my inline text contains only a few tags, so I did not opt for an XML parser.
    String txt="American Airlines made <TRIPS> 100 </TRIPS> flights in <DATE> December </DATE> over <ROUTE> Altantic </ROUTE> ";

    String re1="<([^>]+)>";    // Tag 1
    String re2="([^<]*)";      // Variable Name 1
    String re3="</([^>]+)>";   // Tag 2
    // String re3 = re1;

    Pattern p = Pattern.compile(re1+re2+re3,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    Matcher m = p.matcher(txt);
    if (m.find())
    {
        String tag1=m.group(1);
        String var1=m.group(2);
        System.out.println(tag1.toString());
        System.out.println(var1.toString());
    }

The problem is that it only identifies the first tag and not the second or any subsequent ones.
Current Output

    TRIPS
    100

Desired Output

    TRIPS
    100
    DATE
    December
    ROUTE
    Altantic
Use the pattern

    <([^>]*)>(.*?)<\/\1>

and extract the second group.

Change if (m.find()) to while (m.find()) to get past <TRIPS> 100 </TRIPS>, and use the commented-out version of re3. Otherwise you will not be able to match the other elements that are properly closed.
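For reference, here is a minimal sketch of how the loop and the backreference pattern could fit together. The class name TagExtractor and the helper extract are made up for illustration, and trimming the captured groups is an added assumption so that the printed values carry no surrounding spaces. Note that inside a Java string literal the backreference has to be written as \\1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class TagExtractor {

    // Group 1 is the tag name, group 2 the enclosed text.
    // The backreference \1 requires the closing tag to repeat the opening tag's name.
    private static final Pattern TAG_PATTERN = Pattern.compile(
            "<([^>]+)>(.*?)</\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one {tagName, text} pair per matched element, in order of appearance.
    public static List<String[]> extract(String text) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = TAG_PATTERN.matcher(text);
        // while, not if: each call to find() resumes after the previous match.
        while (m.find()) {
            // trim() is an assumption: it drops the spaces around the captured text.
            pairs.add(new String[] { m.group(1).trim(), m.group(2).trim() });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String txt = "American Airlines made <TRIPS> 100 </TRIPS> flights in "
                + "<DATE> December </DATE> over <ROUTE> Altantic </ROUTE> ";
        for (String[] pair : extract(txt)) {
            System.out.println(pair[0]);
            System.out.println(pair[1]);
        }
    }
}
```

With the sample string from the question, this should print each tag name followed by its text, one per line. It is still only suitable for flat, non-nested tags; nested or attribute-carrying elements would need a real XML parser.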