
Possible Duplicate:
Linux tools to treat files as sets and perform set operations on them

I have two data sets, A and B. The format for each data set is one number per line. For instance,

12345
23456
67891
2345900
12345

Some of the data in A are not included in B. How can I list the entries that appear only in A, and the entries shared by A and B, using Linux/UNIX commands?


2 Answers


Use the comm command.

If your lists are in files listA and listB:

comm listA listB 

By default, comm prints three columns: lines only in listA, lines only in listB, and lines common to both files. Both files must already be sorted.

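You can suppress individual columns with the -1, -2, or -3 options. For example, comm -23 prints only the lines unique to listA, and comm -12 prints only the lines common to both.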
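As a minimal sketch of how the column options answer the question, the snippet below builds two small sample files and extracts "only in A" and "in both". The sample values for listB, the temporary directory, and the variable names are assumptions made for illustration. The inputs are passed through sort -u first, because comm expects sorted input and the sample data contains a duplicate.

```shell
# Hypothetical sample data, one number per line.
tmp=$(mktemp -d)
printf '%s\n' 12345 23456 67891 2345900 12345 > "$tmp/listA"
printf '%s\n' 23456 67891 99999 > "$tmp/listB"

# comm needs sorted input; sort -u also drops duplicate lines.
sort -u "$tmp/listA" > "$tmp/listA.sorted"
sort -u "$tmp/listB" > "$tmp/listB.sorted"

# -2 and -3 hide "only in listB" and "common", leaving lines only in listA.
only_in_a=$(comm -23 "$tmp/listA.sorted" "$tmp/listB.sorted")

# -1 and -2 hide both "only in" columns, leaving lines common to both files.
in_both=$(comm -12 "$tmp/listA.sorted" "$tmp/listB.sorted")

printf 'Only in A:\n%s\n' "$only_in_a"
printf 'In both:\n%s\n' "$in_both"

rm -rf "$tmp"
```

Writing the sorted copies to files keeps the sketch portable to plain sh; in bash, process substitution such as <(sort -u listA) avoids the intermediate files.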

  • The answer assumes listA and listB are already sorted. A more general solution: comm <(sort listA) <(sort listB) Commented Sep 19, 2014 at 7:33
  • Very simple solution. Is the comm command available on all Linux distributions? Commented Jul 6, 2015 at 8:07

This will give you the unique items that exist in A but not in B:

perl -ne 'chomp; print "$_\n" if system("grep", "-Fxq", "--", $_, "B") != 0' A | sort -u

This will give you the list of common items in both A and B:

xargs -I{} grep -Fx -- {} B < A | sort -u
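The same two results can also be sketched with a single grep call per query, which avoids starting one grep process per line of A. This is an alternative to the commands above, not the answer's own method; the sample contents of B, the temporary directory, and the variable names are assumptions for illustration.

```shell
# Hypothetical sample data, one number per line.
tmp=$(mktemp -d)
printf '%s\n' 12345 23456 67891 2345900 12345 > "$tmp/A"
printf '%s\n' 23456 67891 99999 > "$tmp/B"

# -F: fixed strings, -x: match whole lines only, -f: read patterns from a file.
# -v inverts the match, so this keeps lines of A that do not appear in B.
a_not_b=$(grep -Fxvf "$tmp/B" "$tmp/A" | sort -u)

# Without -v, this keeps lines of A that also appear in B.
a_and_b=$(grep -Fxf "$tmp/B" "$tmp/A" | sort -u)

printf 'In A but not in B:\n%s\n' "$a_not_b"
printf 'In both A and B:\n%s\n' "$a_and_b"

rm -rf "$tmp"
```

The -x option matters here: without it, a pattern such as 12345 would also match a longer number like 123456.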
