Join lines based upon the first field
Let's say I have a file with data like below:
ENST000001.1 + 67208778 67210057
ENST000001.1 + 67208778 67210768
ENST000001.1 + 67208778 67208882
ENST000002.5 + 67208778 67213982
ENST000003.1 - 57463571 57463801
ENST000003.1 - 57476352 57476463
ENST000003.1 - 57476817 57476945
I want to join the lines that share the same first field, combining the third and fourth fields of each line into a single underscore-separated token. The expected output is:
ENST000001.1 + 67208778_67210057 67208778_67210768 67208778_67208882
ENST000002.5 + 67208778_67213982
ENST000003.1 - 57463571_57463801 57476352_57476463 57476817_57476945
I have two solutions that produce this output, one in Awk and one in Perl:
awk '{ a[$1" "$2] = a[$1" "$2] " " $3 "_" $4 } END { for (i in a) print i a[i] }' your_file
In Perl:
perl -lane '$H{$F[0]." ".$F[1]} .= " ".$F[2]."_".$F[3]; END { foreach (keys %H) { print $_, $H{$_} } }' your_file
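Both one-liners loop over a hash, so the groups can come out in any order. If the output should follow the order in which each key first appears in the input, a minimal sketch along these lines might help. The function name join_by_id and the array names joined and order are made up for illustration:

```shell
# Hypothetical helper: joins lines that share fields 1 and 2,
# printing groups in the order their key first appears.
join_by_id() {
  awk '
    {
      key = $1 " " $2
      # Record each new key once so output order matches input order.
      if (!(key in joined)) order[++n] = key
      # Append this line as " field3_field4" to its group.
      joined[key] = joined[key] " " $3 "_" $4
    }
    END {
      for (i = 1; i <= n; i++) print order[i] joined[order[i]]
    }
  ' "$@"
}
```

It can be called as join_by_id your_file, or fed from a pipe when no file name is given.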