The standard usage of grep is to return lines that match a pattern.
If a line can contain several matches of the pattern, how can I count each match individually, rather than the number of matching lines?
The grep command has a -c option that counts the number of lines matched by a pattern. Since the standard usage of grep is to return lines that match a pattern, this solves the task "count the number of matches".
If a line can contain several matches of the pattern, you may use grep with its non-standard -o option if you want to count each match individually. This isolates each match on a line of its own. You may then count the number of matches by passing the result through wc -l. This uses wc to do the actual counting, not grep. However, you could cheat and use grep -c . in place of wc -l to count the number of non-empty lines returned from the first grep. Since that is a bit of a hack, and since wc -l does literally what we want, we'll use wc in the examples below.
See the manuals for grep and wc on your system.
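As a self-contained illustration of the difference between the two counts, the sketch below runs both pipelines on a small made-up input. The sample function and its three lines are invented for this example and are not taken from the question. It assumes a grep that supports -o, such as GNU or BSD grep.

```shell
# Hypothetical sample input: three lines, two of which contain "G".
sample() {
  printf '%s\n' 'GATTACA GGG' 'no match here' 'G and G'
}

# Number of lines that contain at least one match.
lines=$(sample | grep -c -e G)

# Number of individual matches: -o puts each match on its own line.
matches=$(sample | grep -o -e G | wc -l)

echo "matching lines: $lines"        # 2
echo "individual matches: $matches"  # 6 (4 on the first line, 2 on the third)
```

Some wc implementations pad their output with leading spaces, so compare the result numerically (for example with [ "$matches" -eq 6 ]) rather than as a string.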
Example: The number of lines matching the pattern G in file:
$ grep -c -e G file
7
Example: The number of matches in the same file, but counting each match individually:
$ grep -o -e G file | wc -l
18
grep -o only prints the non-empty matches. For instance, seq 10 | grep -c '^' prints 10, but seq 10 | grep -o '^' | wc -l prints 0. Likewise, seq 10 | grep -c '7*' prints 10, but seq 10 | grep -o '7*' | wc -l prints 1.
Using awk:
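$ awk '{a += gsub(/pat/,"&"); } END{print a+0}' file
Or, to count the whitespace-separated fields that contain a match (a field with several matches is counted once):
$ awk '{for(i=1;i<=NF;i++)if ($i ~ /pat/) ++a}END{print a+0}' file
In both commands, a+0 makes awk print 0 rather than an empty line when nothing matches.
For overlapping matches, the command is slightly changed (adapted from another answer). It searches for a fixed string with index() and restarts the search one character after the start of each match:
$ echo abababa | awk '{ while (a=index($0,"aba")) {++count; $0=substr($0,a+1)}}END{print count+0}'
With perl, you could do: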
$ perl -lsne '$count++ while m{$regex}g; END{print +$count}' -- -regex='perl regex'
That has the advantage of also counting empty matches such as:
$ seq 10 | perl -lsne '$count++ while m{$regex}g; END{print +$count}' -- -regex='\b'
20
(20 word boundaries in the contents of the lines of the output of seq 10).
With perl regexps, you can also handle some cases of overlapping matches by using look-around operators:
$ echo abababa | perl -lsne '$count++ while m{$regex}g; END{print +$count}' -- -regex='aba'
2
$ echo abababa | perl -lsne '$count++ while m{$regex}g; END{print +$count}' -- -regex='(?=aba)'
3
The second command, instead of matching occurrences of aba, matches the positions within the line where aba can be seen ahead.