1

I want to develop a regular expression to match the tag:

<claim-text>aaaaaaa <claim-text>bbbbbbb</claim-text> <claim-text>ccccccc</claim-text> </claim-text> 

I tried

<claim-text>(.*)</claim-text> 

But only bbbbbbb and ccccccc are matched. Can I get some help matching aaaaaaa as well?

Thanks

3
  • Activate the s (single line, '. matches \n') flag. Commented Aug 12, 2019 at 13:07
  • s.split(/\s*<\/?claim-text>\s*/).filter(Boolean) Commented Aug 12, 2019 at 13:18
  • What should the result be if your tag structure looked like <claim-text>aaaaaaa<claim-text>bbbbbbb</claim-text><claim-text>ccccccc</claim-text>ddddddd</claim-text>? Commented Aug 12, 2019 at 13:24

2 Answers

1

For a generic solution that handles any nesting depth, you need at least a stack, which is not available in most regular expression implementations. However, if you know the structure will only have the depth you specified, you could use something like this:

<claim-text>([^<\r\n]*) 

You can see a working example here: https://regex101.com/r/kbDbwF/1
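
To try the same pattern outside regex101, here is a minimal JavaScript sketch. It assumes the input is the exact string from the question; the variable names and the trimming step are illustrative additions, not part of the original answer.

```javascript
// Sample input copied from the question.
const input =
  "<claim-text>aaaaaaa <claim-text>bbbbbbb</claim-text> <claim-text>ccccccc</claim-text> </claim-text>";

// An opening tag followed by any run of characters that are not '<', CR, or LF.
// The 'g' flag is required by matchAll.
const pattern = /<claim-text>([^<\r\n]*)/g;

// Take capture group 1 of every match and trim surrounding whitespace
// (the trim is an assumption about what the asker wants).
const texts = Array.from(input.matchAll(pattern), (m) => m[1].trim());

console.log(texts);
```

For the sample input this collects the text that directly follows each of the three opening tags, including the outer aaaaaaa.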

It searches for your opening tag and then captures everything up to the next opening or closing tag ([^<]) or the next line break ([^\r\n]). I have combined both character classes into one: [^<\r\n]. However, this is not a general solution! For example, text that appears after a nested closing tag is not captured.
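
For the arbitrary-depth case, the stack mentioned at the start of this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: `parseClaims` and the `{ text, children }` node shape are hypothetical names, the input is assumed to contain no tags other than `<claim-text>`, and a real HTML/XML parser is still the safer choice for anything beyond this.

```javascript
// Hypothetical helper: builds a tree of { text, children } nodes
// from nested <claim-text> tags using an explicit stack.
function parseClaims(source) {
  // Tokens: an opening tag, a closing tag, or a run of non-'<' text.
  // A stray '<' that is not part of either tag is silently skipped.
  const tokenPattern = /<claim-text>|<\/claim-text>|[^<]+/g;
  const root = { text: "", children: [] };
  const stack = [root];

  for (const [token] of source.matchAll(tokenPattern)) {
    const current = stack[stack.length - 1];
    if (token === "<claim-text>") {
      // Open a new node under the current one and descend into it.
      const node = { text: "", children: [] };
      current.children.push(node);
      stack.push(node);
    } else if (token === "</claim-text>") {
      if (stack.length === 1) throw new Error("Unbalanced closing tag");
      // Close the current node and tidy its accumulated text.
      const closed = stack.pop();
      closed.text = closed.text.trim();
    } else {
      // Plain text belongs to whichever node is currently open.
      current.text += token;
    }
  }

  if (stack.length !== 1) throw new Error("Unclosed <claim-text> tag");
  return root.children;
}

const nested =
  "<claim-text>aaaaaaa <claim-text>bbbbbbb</claim-text> <claim-text>ccccccc</claim-text> </claim-text>";
const tree = parseClaims(nested);
console.log(JSON.stringify(tree, null, 2));
```

The regex here only splits the input into tokens; the nesting itself is tracked by the stack, which is the part a single regular expression cannot do in most implementations.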

1

Do not under any circumstances try to parse HTML with a regex unless you wish to invoke rite 666 Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn.

Use an HTML parsing library; see this page for some ways to do it.
