
In my .NET app, I am required to parse text, which can have inline conditions, like this:

Here is some text. {{if: condition }} Here is some conditional text. {{endif}} Here is more text.

And so I have written the following regular expression to find these conditions:

\{\{if\:(?<condition>[^\}]+)\}\}(?<value>.+)\{\{endif\}\} 

This has worked fine for me, and achieved what I want, until I have had to deal with an input with two conditions:

{{if: condition }} content {{endif}} some other content {{if: condition2 }} content2 {{endif}}

In this case, my regular expression matches the entire string, from the {{if}} of the first condition to the {{endif}} of the second, so my application does not work correctly.

How can I rewrite my regular expression to make this work? Or do I have to achieve it without regular expressions?

EDIT: I should have said that the content within the conditions can also contain double curly brackets representing other constructs, so it's not as simple as just ignoring those!

NOTE: There is also the potential issue of nested conditions, but I don't think I'll have to worry about those!

  • "There is also the potential issue of nested conditions, but I don't think I'll have to worry about those!" - that's good, since you can't parse nested structures with (.NET) regular expressions :) Commented May 9, 2011 at 10:26
  • @Porges, you can perfectly parse those using balancing groups. Commented May 9, 2011 at 10:28
  • @Lucero: Wow, thanks! I've looked for something to match Perl's recursive expressions for a while, and couldn't find anything. Commented May 9, 2011 at 10:35
  • @Porges, I tried them out and they do work really well. But my preference is still to use a proper parser for anything more complex (I wrote an engine for the GOLD Parser which I use for any DSL parsing needs). Commented May 9, 2011 at 10:49
  • Nice. I've never heard of balancing groups - they look very powerful. Love it when I come on here to ask a simple question, and end up learning a bunch of new things at the same time! Commented May 9, 2011 at 10:53

1 Answer


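Your problem is the greedy quantifier in the value group: .+ matches as much as possible, so it runs on to the last {{endif}} in the input. Make it lazy (.+?) so that it stops at the first {{endif}}: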

\{\{if\:(?<condition>[^\}]+)\}\}(?<value>.+?)\{\{endif\}\} 
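
To see the difference between the two quantifiers, here is a minimal sketch in Python rather than .NET. It assumes the patterns translate directly, with Python's `(?P<name>...)` named-group syntax standing in for .NET's `(?<name>...)`; the variable names are only illustrative.

```python
import re

# The question's pattern (greedy .+) and the answer's pattern (lazy .+?),
# rewritten with Python's (?P<name>...) named-group syntax.
GREEDY = re.compile(r"\{\{if:(?P<condition>[^}]+)\}\}(?P<value>.+)\{\{endif\}\}")
LAZY = re.compile(r"\{\{if:(?P<condition>[^}]+)\}\}(?P<value>.+?)\{\{endif\}\}")

# The two-condition input from the question.
text = (
    "{{if: condition }} content {{endif}} some other content "
    "{{if: condition2 }} content2 {{endif}}"
)

# Greedy: a single match whose value swallows the first {{endif}}.
greedy_matches = [m.group("value") for m in GREEDY.finditer(text)]

# Lazy: one match per {{if}} ... {{endif}} pair.
lazy_matches = [
    (m.group("condition").strip(), m.group("value").strip())
    for m in LAZY.finditer(text)
]

print(len(greedy_matches))
print(lazy_matches)
```

The lazy version also copes with other double-curly constructs inside the value, since it only stops at a literal {{endif}}. It still does not handle nested conditions.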

2 Comments

I'd make two small tweaks: allow single }s in the 'condition', and allow 'value' to be empty: {{if:(?<condition>(?:(?!}}).)*)}}(?<value>.*?){{endif}}.
I thought about exactly the same suggestions (plus one regarding RegexOptions.Singleline), but decided against them in order to address the lazy/greedy matching issue specifically.
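
The tweaks discussed in these comments can be tried out with a rough Python sketch. It assumes `re.DOTALL` plays the role of .NET's RegexOptions.Singleline (letting `.` match newlines) and again uses Python's `(?P<name>...)` group syntax; the sample input is made up for illustration.

```python
import re

# Commenter's variant: the condition may contain a single '}', and the value
# may be empty. re.DOTALL is assumed here as the analogue of
# RegexOptions.Singleline, so '.' also matches newlines.
TEMPERED = re.compile(
    r"\{\{if:(?P<condition>(?:(?!\}\}).)*)\}\}(?P<value>.*?)\{\{endif\}\}",
    re.DOTALL,
)

# Hypothetical input: a '}' inside a condition, an empty value,
# and a multi-line value containing another {{...}} construct.
sample = "{{if: a}b }}{{endif}} and {{if: c }}line1\nline2 {{x}} {{endif}}"

results = [
    (m.group("condition").strip(), m.group("value").strip())
    for m in TEMPERED.finditer(sample)
]

print(results)
```

The negative lookahead `(?!\}\})` is checked before every character of the condition, so the group stops only at a real closing `}}` instead of at the first single `}`.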
