
I would like to merge two files that share some common data. File 1 contains more data than file 2. I want to merge the files on their shared column, following the order of file 1, and write 0 in the AN1 column (column 5 of file 2) when the variant is not present in file 2.

My files look like this:

File 1

CHR BP SNP CM base
20 61098 rs6078030 -0.00024510777 1
20 61795 rs4814683 0 1
20 63231 rs6076506 0.0005026053 1
20 63244 rs6139074 0.00050714752 1

File 2

CHR BP SNP CM AN1
20 9836704 rs221007 0 1
20 9817032 rs221011 0 1
20 9764069 rs2206484 0 1
20 9639395 rs4816159 0 1

I want to match them based on column 3 (SNP). I want to keep all the other columns for now.

My desired output would look like this (0 when rsX is not present):

File 3

CHR BP SNP CM base AN1
20 61098 rs6078030 -0.00024510777 1 1
20 61795 rs4814683 0 1 1
20 63231 rs6076506 0.0005026053 1 1
20 63244 rs6139074 0.00050714752 1 1

I figured that I need to do this in more than one step. I tried to use awk to do the first step, but it only creates empty files.

awk -F' ' 'NR==FNR{e[$1$2]=1;next};e[$1$2]' file1 file 2 > file 3
awk -F' ' 'NR==FNR{e[$1$2]=1;next};e[1$2]' file2 file 1 > file 3

I guess the last step will be join file1 file 2 > file 3.


1 Answer

awk 'NR==FNR{ snp[$3]; next } { $6=($3 in snp)?(FNR==1?"AN1":"1"):"0" }1' file2 file1 
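Read step by step, the one-liner above can be unrolled roughly as in the sketch below. The `file1` sample is the one from the question; the `file2` sample is made up for illustration (one of its rows reuses an SNP from `file1` so that both the match and the no-match case show up), and the scratch directory and the `file3` output name are assumptions as well.

```shell
# Work in a scratch directory so the sample files do not clobber real data.
cd "$(mktemp -d)"

cat > file1 <<'EOF'
CHR BP SNP CM base
20 61098 rs6078030 -0.00024510777 1
20 61795 rs4814683 0 1
20 63231 rs6076506 0.0005026053 1
20 63244 rs6139074 0.00050714752 1
EOF

# Hypothetical file2: rs4814683 is shared with file1, rs221007 is not.
cat > file2 <<'EOF'
CHR BP SNP CM AN1
20 61795 rs4814683 0 1
20 9836704 rs221007 0 1
EOF

awk '
    # First file given (file2): NR==FNR holds only here.
    # Record every value of column 3, including the header word "SNP".
    NR == FNR { snp[$3]; next }

    # Second file given (file1): append a sixth column.
    # Header line -> "AN1" (its column 3, "SNP", was seen in file2),
    # SNP seen in file2 -> "1", anything else -> "0".
    { $6 = ($3 in snp) ? (FNR == 1 ? "AN1" : "1") : "0"; print }
' file2 file1 > file3

cat file3
```

With these sample files, only the rs4814683 row should end in 1. Because file1 is read second and printed line by line, its row order is preserved, and neither file has to be sorted the way `join` would require.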
  • I am trying to figure out what to do when I want to keep the value of AN1 (it is either 0 or 1; and I do not want to write 1 when it is present in file 1, only in the case when the original value was 1) and write 0 when the variant is not present in file 2 (so this would not change from the first version). Do you know how to do this? Commented Mar 5, 2021 at 14:50
  • Yes, but there is no AN1 pair in your file1, and I'm not sure how I should update my answer to cover that case. Better to open a new question for that. Commented Mar 5, 2021 at 14:59
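For the follow-up raised in the comments (carry over the actual AN1 value from file 2, and write 0 only when the SNP is missing there), one possible sketch is to store column 5 of file 2 keyed by SNP instead of just recording the key. This is not part of the original answer; it assumes AN1 is always column 5 of file 2, and the `file2` rows below are invented for illustration.

```shell
# Scratch directory and sample data; file1 is the sample from the question.
cd "$(mktemp -d)"

cat > file1 <<'EOF'
CHR BP SNP CM base
20 61098 rs6078030 -0.00024510777 1
20 61795 rs4814683 0 1
20 63231 rs6076506 0.0005026053 1
20 63244 rs6139074 0.00050714752 1
EOF

# Hypothetical file2 with both AN1 values for SNPs that also occur in file1.
cat > file2 <<'EOF'
CHR BP SNP CM AN1
20 61795 rs4814683 0 1
20 63231 rs6076506 0.0005026053 0
20 9836704 rs221007 0 1
EOF

awk '
    # file2: remember the AN1 value (column 5) for each SNP (column 3).
    # The header line stores an1["SNP"] = "AN1", which labels the new column.
    NR == FNR { an1[$3] = $5; next }

    # file1: copy the stored value if the SNP was seen, otherwise write 0.
    { $6 = ($3 in an1) ? an1[$3] : "0"; print }
' file2 file1 > file3

cat file3
```

Note that an SNP present in file 2 with AN1 equal to 0 and an SNP absent from file 2 both end up as 0 in the output, which matches what the comment asks for but makes the two cases indistinguishable afterwards.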
