Outputting common lines from 2 files and uncommon lines from both the files in one output file

Question

I have 2 text files. Let's name them file1.txt and file2.txt.

file1.txt is as follows

chr10 181144 225933
chr10 181243 225933
chr10 181500 225933
chr10 226069 255828
chr10 255989 267134
chr10 255989 282777
chr10 267297 282777
chr10 282856 283524
chr10 283618 285377
chr10 285466 285995

file2.txt is as follows

chr10 181144 225933
chr10 181243 225933
chr10 181500 225933
chr10 255989 282777
chr10 267297 282777
chr10 282856 283524
chr10 375542 387138
chr10 386930 387138
chr10 387270 390748
chr10 390859 390938
chr10 391051 394580
chr10 394703 395270

What I want to output in a single file is

All the common lines between file1 and file2
All the lines which are in file1 but are not common to both
All the lines which are in file2 but are not common to both.

I wrote a Perl script to do this but I am pretty sure there must be a command line or an easier way to do it.

sort -u file1.txt file2.txt is the obvious answer here, unless you want the lines in the output to be in that particular order...
– don_crissti, Commented Feb 25, 2017 at 13:38
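
A small sketch of what this comment suggests, run on a cut-down sample of the question's data. The scratch directory and the union.txt name are assumptions for illustration. It yields every distinct line exactly once, but it does not keep the common, file1-only and file2-only groups apart.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d) && cd "$workdir"

# Cut-down sample inputs taken from the question's data.
printf '%s\n' 'chr10 181144 225933' 'chr10 226069 255828' 'chr10 255989 282777' > file1.txt
printf '%s\n' 'chr10 181144 225933' 'chr10 255989 282777' 'chr10 375542 387138' > file2.txt

# Merge both files, sort them, and keep one copy of each distinct line.
sort -u file1.txt file2.txt > union.txt
```

If the grouping by origin matters, the comm and grep approaches in the answers are the better fit.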

steeldriver · Accepted Answer · 2014-09-12 19:09:26Z

Lines common to both files:

comm -12 file1.txt file2.txt > results.txt

Add lines unique to file1.txt:

comm -23 file1.txt file2.txt >> results.txt

Add lines unique to file2.txt:

comm -13 file1.txt file2.txt >> results.txt

If the files are not already sorted, you must sort them first, e.g. if your shell supports process substitution:

comm -12 <(sort file1.txt) <(sort file2.txt)

etc.
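
Putting the three steps together, a minimal sketch that sorts each file once and writes all three groups to one output file could look like the following. The scratch directory, the .sorted file names and the cut-down sample data are assumptions for illustration. It uses temporary files instead of process substitution so it also runs in a plain POSIX sh.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d) && cd "$workdir"

# Use one fixed collation so sort and comm agree on the ordering.
LC_ALL=C; export LC_ALL

# Cut-down sample inputs taken from the question's data.
printf '%s\n' 'chr10 181144 225933' 'chr10 226069 255828' 'chr10 255989 282777' > file1.txt
printf '%s\n' 'chr10 181144 225933' 'chr10 255989 282777' 'chr10 375542 387138' > file2.txt

# comm needs sorted input; -u also drops duplicate lines within a file.
sort -u file1.txt > file1.sorted
sort -u file2.txt > file2.sorted

{
  comm -12 file1.sorted file2.sorted   # lines present in both files
  comm -23 file1.sorted file2.sorted   # lines only in file1.txt
  comm -13 file1.sorted file2.sorted   # lines only in file2.txt
} > results.txt
```

Grouping the three commands in braces opens results.txt once, instead of truncating it with > and then appending twice with >>.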

You must sort the two files first.
– cuonglm, Commented Sep 12, 2014 at 18:23

We also need sort's -u option to prevent duplicate lines.
– cuonglm, Commented Sep 12, 2014 at 19:14

cuonglm · Accepted Answer · 2014-09-12 19:43:23Z

There is a comm command to do this job, but you can also do it by combining other standard tools such as grep, sort, uniq and join. Here is a solution using grep, with the equivalent using comm.

Lines common to both files:

grep -xF -f file1 file2
comm -12 <(sort -u file1) <(sort -u file2)

Lines only in file1:

grep -vxF -f file2 file1
comm -23 <(sort -u file1) <(sort -u file2)

Lines only in file2:

grep -vxF -f file1 file2
comm -13 <(sort -u file1) <(sort -u file2)
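
As a rough sketch, the three grep commands can also be combined to write everything into one file. The scratch directory, the results_grep.txt name and the cut-down sample data are assumptions for illustration; the inputs are left unsorted on purpose, since grep does not need sorted input and keeps each file's original line order.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d) && cd "$workdir"

# Cut-down, deliberately unsorted sample inputs from the question's data.
printf '%s\n' 'chr10 226069 255828' 'chr10 181144 225933' 'chr10 255989 282777' > file1.txt
printf '%s\n' 'chr10 375542 387138' 'chr10 255989 282777' 'chr10 181144 225933' > file2.txt

# -F: fixed strings, -x: whole-line match, -f: read patterns from a file,
# -v: invert the match (print lines that match no pattern).
{
  grep -xF  -f file1.txt file2.txt   # common lines, in file2.txt order
  grep -vxF -f file2.txt file1.txt   # lines only in file1.txt
  grep -vxF -f file1.txt file2.txt   # lines only in file2.txt
} > results_grep.txt
```

Note that grep exits with status 1 when it selects no lines, which matters if the script runs under set -e and one of the three groups happens to be empty.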
