I was wondering how to match a line that does not contain a specific word using Python-style regex (regex only, without involving Python functions).

Example:

PART ONE OVERVIEW 1
Chapter 1 Introduction 3

I want to match lines that do not contain the word "PART".

  • What are you going to be using to do the matching? Commented Jun 7, 2011 at 0:13
  • Does PART always appear at the start? Commented Jun 7, 2011 at 0:17
  • @David: Just clarify the example. Commented Jun 7, 2011 at 0:36
  • And the correct answer is ^(?!.*PART).*$. Or ^(?!.*\bPART\b).*$ if a whole word check is necessary. Or, if an entire-string match is not necessary, remove .*$ from both of the above. Commented Nov 7, 2016 at 15:10

1 Answer

This should work:

/^((?!PART).)*$/ 

Edit (by request): How this works

The (?!...) syntax is a negative lookahead, which I've always found tough to explain. Basically, it means "whatever follows this point must not match the regular expression /PART/." The site I've linked explains this far better than I can, but I'll try to break this down:

    ^         # Start matching from the beginning of the string.
    (?!PART)  # This position must not be followed by the string "PART".
    .         # Matches any character except line breaks (it will include those in single-line mode).
    $         # Match all the way until the end of the string.

The ((?!xxx).)* idiom is probably hardest to understand. As we saw, (?!PART) looks at the string ahead and says that whatever comes next can't match the subpattern /PART/. So what we're doing with ((?!xxx).)* is going through the string letter by letter and applying the rule to all of them. Each character can be anything, but if you take that character and the next few characters after it, you'd better not get the word PART.
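To see the idiom in action, here is a minimal sketch using Python's re module. The sample lines and the helper name lacks_part are illustrative assumptions, not part of the original answer. Note that the surrounding / delimiters are dropped, since they are not part of Python's regex syntax.

```python
import re

# The pattern from the answer, without the / delimiters.
NO_PART = re.compile(r"^((?!PART).)*$")


def lacks_part(line: str) -> bool:
    """Return True if the line does not contain the substring PART.

    Hypothetical helper, named here only for illustration.
    """
    return NO_PART.match(line) is not None


# Illustrative sample lines (assumed, based on the question's example).
samples = [
    "PART ONE OVERVIEW 1",
    "Chapter 1 Introduction 3",
    "",
]

results = [lacks_part(s) for s in samples]
print(results)  # [False, True, True]
```

The empty string matches because the group is allowed to repeat zero times, which is probably what you want for a "does not contain" test.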

The ^ and $ anchors are there to demand that the rule be applied to the entire string, from beginning to end. Without those anchors, any piece of the string that didn't begin with PART would be a match. Even PART itself would have matches in it, because (for example) the letter A isn't followed by the exact string PART.
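The effect of the anchors can be checked with a small sketch, again assuming Python's re module. The variable names are made up for illustration.

```python
import re

unanchored = re.compile(r"((?!PART).)*")
anchored = re.compile(r"^((?!PART).)*$")

text = "PART"

# Unanchored: an empty match at position 0 already counts as success.
empty_hit = unanchored.search(text)

# Starting at the letter A (index 1), the pattern consumes "ART",
# because none of those positions is followed by the string "PART".
tail_hit = unanchored.match(text, 1)

# Anchored: the whole string must satisfy the rule, so nothing matches.
anchored_hit = anchored.search(text)

print(repr(empty_hit.group(0)))  # ''
print(tail_hit.group(0))         # ART
print(anchored_hit)              # None
```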

Since we do have ^ and $, if PART were anywhere in the string, the (?!PART) check would fail at the position where PART begins, so the repetition could not get past that point to reach $, and the overall match would fail. Hope that's clear enough to be helpful.
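If the input is one multi-line string rather than a single line, one way to apply the same pattern per line is Python's re.MULTILINE flag, which makes ^ and $ match at line boundaries. This is a sketch under that assumption; the sample text is illustrative. It also runs the alternative pattern suggested in the comments on the question, ^(?!.*PART).*$, for comparison.

```python
import re

# Illustrative input (assumed): a table of contents, no trailing newline.
text = (
    "PART ONE OVERVIEW 1\n"
    "Chapter 1 Introduction 3\n"
    "PART TWO DETAILS 25\n"
    "Chapter 2 Basics 27"
)

# The answer's pattern, applied line by line via re.MULTILINE.
per_char = re.compile(r"^((?!PART).)*$", re.MULTILINE)

# The alternative from the question's comments: one lookahead per line.
per_line = re.compile(r"^(?!.*PART).*$", re.MULTILINE)

# group(0) is the whole matched line; findall() would return group 1 instead.
kept = [m.group(0) for m in per_char.finditer(text)]
kept_alt = [m.group(0) for m in per_line.finditer(text)]

print(kept)  # ['Chapter 1 Introduction 3', 'Chapter 2 Basics 27']
```

Both patterns select the same lines here; the second performs a single lookahead per line instead of one per character.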

5 Comments

For me, I needed to search for lines that don't have a specific word (word1) before another specific word (word2). So I used a negative lookbehind like this: (?<!word1)word2. It took me a long while to make it work.
Thanks a lot for differentiating between searching at the beginning of the line and searching the whole line. You saved my day.
I tried to use this to get rid of strings like 'Sensor_5_Wind' but /^((?!Wind).)*$/ doesn't work?
@KillerSnail - That's not enough information for me to help you; try asking a separate question.
If you are using grep, use the -P option, e.g. grep -P '^((?!do not include this string).)*$'
