3

Possible Duplicate:
what does lazy and greedy means in regexp?

I know that in regex a question mark after *, + or ? makes the quantifier ungreedy, but if I want to match any character, what is the difference between using (.*) and (.*?)?

Thanks.

EDIT: In my case I want to check a URL. What are the differences between

http://site\.net/(.*?)\.html 

and

http://site\.net/(.*)\.html 

?


4 Answers

22

.* is greedy: it first consumes as many characters as it can and then gives characters back (backtracks) only as far as needed for the rest of the regex to match. The part of the regex after .* therefore matches at its last possible position.

.*? is ungreedy (lazy): it consumes as few characters as it can and extends by one character at a time, only when the rest of the regex fails to match. The part of the regex after .*? therefore matches at its first possible position, even if .*? itself could have kept going.

Example:

/(.*) dog/ will match "I think your dog bit my dog" and group 1 will be "I think your dog bit my".

/(.*?) dog/ will match "I think your dog bit my dog" and group 1 will be "I think your".
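The two results above can be checked with a short script. This is a minimal sketch using Python's `re` module, which is an assumption here since the question does not name a regex engine; the variable names are made up for illustration.

```python
import re

text = "I think your dog bit my dog"

# Greedy: (.*) grabs as much as possible, so " dog" matches at its last position.
greedy = re.search(r"(.*) dog", text)

# Lazy: (.*?) grabs as little as possible, so " dog" matches at its first position.
lazy = re.search(r"(.*?) dog", text)

print(greedy.group(1))  # I think your dog bit my
print(lazy.group(1))    # I think your
```

Both patterns match the same string; only the contents of group 1 differ.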


2 Comments

This doesn't answer the question. He already says he knows that a ? makes the regular expression ungreedy.
Yes, I wanted something like an example.
7

If nothing follows the (.*) in the regular expression except the end of the string, then there is no difference in what is captured. However, if anything else follows, then there is a difference:

"I went to the shops and then I went home"
/(.*) went/  => "[I went to the shops and then I] went"
/(.*?) went/ => "[I] went"
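One way to picture why the two captures differ is the order in which candidate lengths for the group are tried. The sketch below is a simplified model, not how a real regex engine is implemented: the helper `first_split` is hypothetical, it assumes the match starts at index 0, and it assumes the text after the group is a plain literal.

```python
def first_split(text, tail, greedy):
    """Return what (.*) or (.*?) would capture when followed by the literal `tail`.

    Simplified model: the match is assumed to start at index 0 and the text
    is assumed to contain no newlines (which '.' would not match by default).
    """
    lengths = range(len(text) + 1)
    if greedy:
        lengths = reversed(lengths)  # greedy: try the longest capture first
    for n in lengths:                # lazy: try the shortest capture first
        if text.startswith(tail, n):  # does `tail` match right after the capture?
            return text[:n]
    return None  # no capture length lets `tail` match


text = "I went to the shops and then I went home"
print(first_split(text, " went", greedy=True))   # I went to the shops and then I
print(first_split(text, " went", greedy=False))  # I
```

Both variants search the same set of possibilities; greedy simply accepts the longest workable capture and lazy the shortest.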


3

Assume that you have this URL:

http://example.net/some/wierd/path.html?returnTo=somedoc.html 

A greedy (.*)\.html extends to the last .html and would match the entire line:

http://example.net/some/wierd/path.html?returnTo=somedoc.html 

while a non-greedy (.*?)\.html stops at the first .html and returns:

http://example.net/some/wierd/path.html 
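The URL case can be reproduced as follows. This is a sketch assuming Python's `re` module and the question's two patterns with the host swapped to example.net to fit the URL used in this answer.

```python
import re

url = "http://example.net/some/wierd/path.html?returnTo=somedoc.html"

greedy = re.match(r"http://example\.net/(.*)\.html", url)
lazy = re.match(r"http://example\.net/(.*?)\.html", url)

# Greedy runs to the last ".html", so the whole line is matched.
print(greedy.group(0))  # http://example.net/some/wierd/path.html?returnTo=somedoc.html
print(greedy.group(1))  # some/wierd/path.html?returnTo=somedoc

# Lazy stops at the first ".html".
print(lazy.group(0))    # http://example.net/some/wierd/path.html
print(lazy.group(1))    # some/wierd/path
```

If the goal is to validate the whole URL rather than find a prefix, requiring the pattern to reach the end of the string (for example with `re.fullmatch` or a trailing `$`) makes both variants capture the same text, because the lazy group is forced to extend until the final `.html`.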


2

As you already know what ungreedy behaviour is, I won't explain that again.

It depends on what comes after the (.*?). That's what ungreedy behaviour is for.

Interestingly enough, this means that a regex of the form /(.*?)/ doesn't make much sense, because how can you be lazy if you match everything anyway?

If you try to create this regex in e.g. Regexr, it won't even compile, because it's nonsense.

Only if you put something after the group will your regex make any kind of sense. I'm not sure whether all regex engines do the same as Regexr here and refuse to accept that regex.

So, if you want to match anything up to a certain character, you have to put that specific character after your ungreedy-anything group. This way, everything before that particular character is matched.

To conclude: it doesn't make any difference IF there isn't something AFTER the group.
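How an engine treats a bare lazy group is easy to check for any particular engine. As a data point, here is a sketch assuming Python's `re` module (other engines may behave differently): the pattern is accepted, and the conclusion above holds when the match has to reach the end of the string, while an unanchored search lets the lazy group stop immediately with an empty capture.

```python
import re

text = "anything at all"

# Unanchored: nothing forces the lazy group to grow, so it captures nothing.
lazy = re.search(r"(.*?)", text)
greedy = re.search(r"(.*)", text)
print(repr(lazy.group(1)))    # ''
print(repr(greedy.group(1)))  # 'anything at all'

# Anchored to the whole string: both variants capture the same text.
lazy_full = re.fullmatch(r"(.*?)", text)
greedy_full = re.fullmatch(r"(.*)", text)
print(lazy_full.group(1) == greedy_full.group(1))  # True
```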

