
I have a text file (list.txt) containing a series of patterns, one per line, and I want to count how many matches each pattern has in a target file (file1). The target file looks like this:

    $ cat file1
    2346 TGCA
    2346 TGCA
    7721 GTAC
    7721 GTAC
    7721 CTAC

I need a count for each numerical pattern (2346 and 7721).

I have this script that works if you provide a list of the patterns in quotes:

    $ for p in '7721' '2346'; do printf '%s = ' "$p"; grep -c "$p" file1; done
    7721 = 3
    2346 = 2

What I would like to do is search for all patterns in list.txt:

    $ cat list.txt
    7721
    2346
    6555
    25425
    22
    125
    ....
    19222

How can I convert my script above to read the patterns from list.txt, search for each one, and return the same output as above (pattern = count)? For example:

    2346 = 2
    7721 = 3
    ....
    19222 = 6
  • Does it matter that one of those 7721 lines is different than the other two? Commented Apr 7, 2015 at 14:46
  • I don't see how this is a duplicate of that question. There are plenty of questions of this same form that it could reasonably be a duplicate of, but not that one. Commented Apr 7, 2015 at 14:50
  • It doesn't matter. In fact, the string after the ID may vary across many matches. Commented Apr 7, 2015 at 14:58

1 Answer


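Try this awk one-liner:

    awk 'NR==FNR{p[$0]=0;next} $1 in p{p[$1]++} END{for(x in p)print x" = "p[x]}' list.txt file1

It reads list.txt first and stores each pattern with a count of 0, then increments the count whenever the first field of a line in file1 exactly equals a stored pattern. The order in which `for(x in p)` prints the patterns is unspecified.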
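
If you would rather keep the grep loop from the question, a minimal sketch is to feed list.txt to a `while read` loop instead of hard-coding the patterns. The function name `count_patterns` is made up for illustration, and the sketch assumes the patterns are plain strings (hence `-F`). Like the original loop, it counts lines that contain the pattern anywhere, so a short pattern such as 22 would also count a line starting with 19222; the awk version matches the first field exactly.

```shell
#!/bin/sh
# count_patterns LIST TARGET
# Hypothetical helper: prints "pattern = count" for every non-empty line of LIST.
count_patterns() {
    # "|| [ -n "$p" ]" also handles a last line that has no trailing newline
    while IFS= read -r p || [ -n "$p" ]; do
        [ -n "$p" ] || continue        # skip blank lines in the list
        printf '%s = ' "$p"
        # -c: count matching lines, -F: treat the pattern as a fixed string.
        # grep exits non-zero when the count is 0, so "|| true" keeps the
        # loop going even under "set -e".
        grep -cF -- "$p" "$2" || true
    done < "$1"
}
```

Calling `count_patterns list.txt file1` on the sample data from the question should print `7721 = 3` and `2346 = 2`, followed by one line for each remaining pattern, in the same order as list.txt.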