I have a file containing some data (A.txt; tab-separated; every line starts with a tab, so the first column is empty):

    Well  Fluor  Target  Content  Sample       Cq                SQ
    A01   Cy5    EC      Unkn-01  205920777.1  25.714557922167   NaN
    A01   FAM    Covid   Unkn-01  205920777.1  21.6541150578409  NaN
    A02   Cy5    EC      Unkn-09  neg5         25.5068289526473  NaN
    A02   FAM    Covid   Unkn-09  neg5         NaN               NaN
    A07   Cy5    EC      Unkn-49               NaN               NaN
    A07   FAM    Covid   Unkn-49               NaN               NaN

And I have a template (B.txt; comma-separated):

    kit
    Software Version =
    Date And Time of Export =
    Experiment Name =
    Instrument Software Version =
    Instrument Type = CFX
    Instrument Serial Number =
    Run Start Date =
    Run End Date =
    Run Operator =
    Batch Status = VALID
    Method = Novaprime
    Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis
    ,,,,,,,,,,
    *reporting.

I want to write the data from A.txt into C.txt using the template B.txt. The expected C.txt:

    kit
    Software Version =
    Date And Time of Export =
    Experiment Name =
    Instrument Software Version =
    Instrument Type = CFX
    Instrument Serial Number =
    Run Start Date =
    Run End Date =
    Run Operator =
    Batch Status = VALID
    Method = Novaprime
    Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis
    ,,205920777.1,A01,Unkn-01
    ,,neg5,A02,Unkn-09
    ,,,,,,,,,,
    *reporting.

The trick is to print only the lines of A.txt whose column 5 is not empty. I've tried things like:

awk 'NR==FNR{a[$5]=$1;next}{print $1,$2,a[$1]} ' A.txt B.txt > C.txt 

But it can't work because B.txt doesn't have a matching key. The different separators are also an issue. Does anyone have an idea?

Thanks

  • Do you need one C.txt file with a line for each entry in A.txt? Or many C.txt files, one for each entry? Your example has only 2, why? Is there something special about Unkn-01 and Unkn-09? Please clarify. Commented Jul 7, 2020 at 10:49

1 Answer

Assuming the first column of your file is empty, as you say, every field number is shifted by one: what you call the 5th field is actually the 6th as far as awk is concerned. In any case, the simplest approach I can think of is to first convert your A.txt file into a format you can use:

    $ awk -F'\t' -v OFS="," '(NR>1 && $6!="NaN"){print ",",$6,$2,$5}' A.txt | sort | uniq
    ,,205920777.1,A01,Unkn-01
    ,,neg5,A02,Unkn-09

That should give you the strings you want to insert into your C.txt. So, to add them, you can do something inelegant like this:

    (
      head -n 13 B.txt
      awk -F'\t' -v OFS="," '(NR>1 && $6!="NaN"){print ",",$6,$2,$5}' A.txt | sort | uniq
      tail -n+14 B.txt
    ) > C.txt

Which produces:

    $ cat C.txt
    kit
    Software Version =
    Date And Time of Export =
    Experiment Name =
    Instrument Software Version =
    Instrument Type = CFX
    Instrument Serial Number =
    Run Start Date =
    Run End Date =
    Run Operator =
    Batch Status = VALID
    Method = Novaprime
    Date And Time of Export,Batch ID,Sample Name,Well,Sample Type,Status,Interpretive Result,Action*,Curve analysis
    ,,205920777.1,A01,Unkn-01
    ,,neg5,A02,Unkn-09
    ,,,,,,,,,,
    *reporting.
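
If the hard-coded line numbers in the head/tail pipeline feel fragile, the same result can probably be had in a single awk pass. The following is only a sketch under some assumptions that are not confirmed by the question: that the column-header row of B.txt is the one line starting with `Date And Time of Export,`, that a missing sample may show up in A.txt either as an empty field or as `NaN`, and that A.txt is not empty. The function name `fill_template` is made up for illustration. Unlike `sort | uniq`, it keeps the first row seen for each sample, in file order.

```shell
# fill_template A_FILE B_FILE
# Print B_FILE with one CSV row per distinct sample from A_FILE
# inserted right after the column-header line of the template.
fill_template() {
    awk '
        # First file (A): tab-separated, with an empty leading column,
        # so f[2]=Well, f[5]=Content, f[6]=Sample.
        NR == FNR {
            split($0, f, "\t")
            # Skip the header row, empty or NaN samples, and repeated samples.
            if (FNR > 1 && f[6] != "" && f[6] != "NaN" && !seen[f[6]]++)
                rows[++count] = ",," f[6] "," f[2] "," f[5]
            next
        }
        # Second file (B): copy every template line through unchanged.
        { print }
        # Assumption: this pattern matches only the column-header row.
        /^Date And Time of Export,/ {
            for (i = 1; i <= count; i++) print rows[i]
        }
    ' "$1" "$2"
}
```

It would be called as `fill_template A.txt B.txt > C.txt`.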
  • Thanks for your answer. It helps a lot! Just one step is missing, because I only want one line per sample, not one per occurrence. Commented Jul 7, 2020 at 11:16
  • @nstatam see updated answer. Commented Jul 7, 2020 at 11:19