
I'm hoping to write a bash script to grep files matching several strings.

I've found a 'hard-wired' solution (source: "Find files containing multiple strings"):

find . -type f -exec grep -l 'string1' {} \; | xargs grep -l 'string2' | xargs grep -l 'string3' | xargs grep -l 'string4'

This returns all files containing string1 && string2 && string3 && string4.

How can I convert this into a bash shell function fn that takes an arbitrary number of arguments? I'd like fn string1 string2 string3 string4 ... to give me identical results. Presumably this would involve looping through the arguments and piping the results to successive xargs grep -l ${!i} commands.

1 Answer


If your grep supports -P (PCRE) option, how about:

    # grep files containing arbitrary number of strings
    fn() {
        local dir=$1                    # directory to search
        shift
        local -a patterns=("$@")        # list of target strings
        local i                         # loop variable
        local pat="(?s)"                # single-line mode makes dot match a newline
        for i in "${patterns[@]}"; do
            pat+="$(printf "(?=.*\\\b%s\\\b)" "$i")"    # one lookahead per word
        done
        find "$dir" -type f -exec grep -zlP "$pat" {} \;
    }

    # example of usage
    fn . string1 string2 string3

If the passed word list is word1 word2, it generates the regex pattern "(?s)(?=.*\bword1\b)(?=.*\bword2\b)", which matches a file containing both word1 and word2 in any order.

  • (?s) enables "single-line" (dotall) mode, which makes a dot match any character, including a newline.
  • The -z option makes grep use a null character as the input record separator instead of a newline, so a file that contains no null bytes is treated as a single line.
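
To see what the loop builds and why -z matters, here is a small sketch. build_pattern is a hypothetical helper (not part of fn) that prints the pattern for its arguments, and the two grep calls assume GNU grep with -P support. The words are inserted verbatim, so any regex metacharacters in them are interpreted as regex syntax.

```shell
# build_pattern is a hypothetical helper used only for illustration.
build_pattern() {
    local pat='(?s)' word
    for word in "$@"; do
        pat="${pat}(?=.*\\b${word}\\b)"   # append one lookahead per word
    done
    printf '%s\n' "$pat"
}

pat=$(build_pattern word1 word2)
printf '%s\n' "$pat"    # (?s)(?=.*\bword1\b)(?=.*\bword2\b)

# Count matching records for a two-line input (assumes GNU grep with -P).
# Without -z each line is its own record, so no record holds both words.
per_line=$(printf 'word1\nword2\n' | grep -cP "$pat" || true)
# With -z the whole input is one record, so both lookaheads can succeed.
whole=$(printf 'word1\nword2\n' | grep -czP "$pat" || true)
printf 'per-line: %s, whole-input: %s\n' "$per_line" "$whole"
```

With GNU grep, the per-line count should be 0 and the whole-input count 1, which is why fn needs both (?s) and -z to find words that sit on different lines.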

If grep -P is not available, here is an alternative using a loop:

    fn() {
        local dir=$1                    # directory to search
        shift
        local -a patterns=("$@")        # list of target strings
        local i f fail                  # local variables
        while IFS= read -rd "" f; do    # loop over the files fed by "find"
            fail=0                      # flag to indicate that a match failed
            for i in "${patterns[@]}"; do                # loop over the target strings
                grep -q "$i" "$f" || { fail=1; break; }  # if not matched, set the flag and exit the loop
            done
            (( fail == 0 )) && echo "$f"    # if all matched, print the filename
        done < <(find "$dir" -type f -print0)
    }
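
The chained xargs idea from the question can also be generalized directly, without PCRE. The following is only a sketch under some assumptions: fn_chain and filter_all are hypothetical names, and it relies on -print0 in find, -0 and -r in xargs, and -Z in grep, all of which GNU findutils and GNU grep provide. Each recursion level adds one xargs grep -l stage to the pipeline, so only files that survived the previous string are searched for the next one.

```shell
# filter_all (hypothetical name) reads NUL-delimited file names on stdin and
# keeps only the files that match every pattern given as an argument.
filter_all() {
    if [ "$#" -eq 0 ]; then
        tr '\0' '\n'                 # no patterns left: print the survivors
    else
        local pat=$1
        shift
        # -r: skip grep when no names arrive; -Z: emit NUL-delimited names
        xargs -0 -r grep -lZ -e "$pat" -- | filter_all "$@"
    fi
}

# fn_chain (hypothetical name); usage: fn_chain DIR STRING...
fn_chain() {
    local dir=$1
    shift
    find "$dir" -type f -print0 | filter_all "$@"
}

# example of usage (commented out so the block only defines functions)
# fn_chain . string1 string2 string3
```

Called with no strings, this lists every file under the directory. File names that contain newlines would be printed ambiguously, because the final stage converts the NUL separators to newlines.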

4 Comments

OP is not looking for lines containing all the patterns. They want files which match all the patterns, not necessarily on the same line. So your first solution doesn't work.
@rici oh! that's right. I've fixed the code to match words on different lines. Thank you so much for pointing it out.
Thanks! Note: \\\b in the first version means that it matches whole words only.
Thank you for the feedback. That's correct. If you want e.g. str to match not only str but also string, just drop the \\\b sequences.
