I have two lists of files with their md5sum checksums, and the lists have different paths for the same files.
Example content of the first checksum file (server.list):

    2c03ff18a643a1437ec0cf051b8b7b9d /tmp/fastq1_L001_R1_001.fastq.gz
    c430f587aba1aa9f4fdf69aeb4526621 /tmp/fastq1_L001_R2_001.fastq.gz/
    6e6bcd84f264233cf7c428c0cfdc0c03 tmp/fastq1_L002_R1_001.fastq.gz

Example content of the second checksum file (downloaded.list):

    2c03ff18a643a1437ec0cf051b8b7b9d /home/projects/fastq1_L001_R1_001.fastq.gz
    c430f587aba1aa9f4fdf69aeb4526621 /home/projects/fastq1_L001_R2_001.fastq.gz
    6e6bcd84f264233cf7c428c0cfdc0c03 /home/projects/fastq1_L002_R1_001.fastq.gz

When I run the following command, I get this output:

    awk -F"/" 'FNR==NR{filearray[$1]=$NF; next }!($1 in filearray){printf "%s has a different md5sum\n",$NF}' downloaded.list server.list
    fastq1_L001_R1_001.fastq.gz has a different md5sum
    fastq1_L001_R2_001.fastq.gz has a different md5sum
    fastq1_L002_R2_001.fastq.gz has a different md5sum

Why am I getting this message when the first column is the same in both files? Can someone enlighten me on this issue?
Edit:
If I remove the path and leave only the file name, it works just fine.
Edit 2:
As pointed out, there is another possibility of file path form, which does not start with /. In this case, I cannot use / as the field separator.
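With -F"/", awk splits each line only on slashes, so $1 is everything before the first slash: the checksum plus the whitespace that follows it and, for a path that does not start with /, the leading directory name as well. -F"/" makes it treat all of that as part of $1.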
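
A minimal sketch of both points, under stated assumptions: the sample lines are adapted from the question (single space after the checksum, the trailing slash on the second path dropped as a presumed typo), the third checksum in downloaded.list is replaced by zeros purely to produce a mismatch, and no path contains whitespace. The helper name base is hypothetical. The first awk call prints $1 in brackets to show what -F"/" actually puts there; the second splits on whitespace instead and keys each checksum by the last path component, so it does not matter whether a path starts with /.

```shell
#!/bin/sh
# Hypothetical sample data adapted from the question, written to a temp dir.
dir=$(mktemp -d)
cat > "$dir/server.list" <<'EOF'
2c03ff18a643a1437ec0cf051b8b7b9d /tmp/fastq1_L001_R1_001.fastq.gz
c430f587aba1aa9f4fdf69aeb4526621 /tmp/fastq1_L001_R2_001.fastq.gz
6e6bcd84f264233cf7c428c0cfdc0c03 tmp/fastq1_L002_R1_001.fastq.gz
EOF
cat > "$dir/downloaded.list" <<'EOF'
2c03ff18a643a1437ec0cf051b8b7b9d /home/projects/fastq1_L001_R1_001.fastq.gz
c430f587aba1aa9f4fdf69aeb4526621 /home/projects/fastq1_L001_R2_001.fastq.gz
00000000000000000000000000000000 /home/projects/fastq1_L002_R1_001.fastq.gz
EOF

# Diagnostic: show $1 exactly as awk sees it when splitting on "/".
fields=$(awk -F"/" '{ printf "[%s]\n", $1 }' "$dir/server.list")
printf '%s\n' "$fields"

# Comparison: default whitespace splitting, so $1 is the checksum and
# $2 is the path; checksums are keyed by file name, not by path.
result=$(awk '
    # Return the last /-separated component of a path.
    function base(p,    parts, n) { n = split(p, parts, "/"); return parts[n] }

    # First file: remember the checksum recorded for each file name.
    FNR == NR { sum[base($2)] = $1; next }

    # Second file: report names that are absent or whose checksum differs.
    {
        name = base($2)
        if (!(name in sum))
            printf "%s is missing from the first list\n", name
        else if (sum[name] != $1)
            printf "%s has a different md5sum\n", name
    }
' "$dir/downloaded.list" "$dir/server.list")
printf '%s\n' "$result"

rm -r "$dir"
```

With this sample data the diagnostic shows a trailing space inside the brackets for the absolute paths and the extra tmp for the relative one, and the comparison reports only fastq1_L002_R1_001.fastq.gz. If the lists come from md5sum in binary mode, the file name may carry a leading *, which this sketch does not strip.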