I've got a file called test.txt and after some manipulations it looks like this:
    Metabolism
    Global and overview maps
    01100 Metabolic pathways (1689)
    01110 Biosynthesis of secondary metabolites (677)
    01120 Microbial metabolism in diverse environments (356)
    01200 Carbon metabolism (44)
    012111 Carbon metabolism (151) test: test test (44)

Now, I want to move the bracketed number at the end of each row into its own column, using semicolons as my delimiter of choice. I also want to put quotes around all the text between the ID number at the start and the bracketed number at the end. Finally, I would like to keep the header rows (the first two in this example).
My code:
    sed -r 's/ +/;/' test.txt | awk 'NF{NF-=1};1' | awk -F ";" '{sub($2, "\"&\""); print}'

My current output:
"" Global;"and overview" 01100;"Metabolic pathways" 01110;"Biosynthesis of secondary metabolites" 01120;"Microbial metabolism in diverse environments" 01200;"Carbon metabolism" 012111;Carbon (151) test: test test As you can see the "Metabolism" header is gone because it is technically the last value in that row, as well as "maps" in the second row, with a semi colon after "Global" which is not needed. Some rows have bracketed numbers within the text which I should keep but otherwise all rows end with a bracketed value that should be separated into it's only column separated by a semi colon. I also can't get the quotes to go around all of the second column in the last row, whereas the other rows are okay. Finally, I don't know how to separate the bracketed values so that they are a third column.
My desired output (keeping the numbers as a separate column):

    "Metabolism"
    "Global and overview"
    01100:"Metabolic pathways";1689
    01110:"Biosynthesis of secondary metabolites";677
    01120:"Microbial metabolism in diverse environments";356
    01200:"Carbon metabolism";44
    012111:"Carbon metabolism (151) test: test test";44

I am using GNU awk version 4.1.3 and GNU sed version 4.2.2 on Windows Subsystem for Linux.
... Carbon metabolism coming from. The command that you are using gives a syntax error in several versions of sed, so I don't see how it's producing the output that you have. What OS is this and what version of sed and awk? Are they the GNU versions?

... sed command has a syntax error. It needs to be sed -r 's/ +/;/' test.txt whereas you have sed -r 's/ +/;' test.txt which doesn't work with any version of sed.
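For the transformation described in the question, one possible approach is a single awk pass over the original test.txt instead of the three-stage pipeline. This is a minimal sketch under some assumptions: a data row starts with a numeric ID and ends with a bracketed number, any other non-empty line is a header and is quoted whole (so "maps" is kept), and the semicolon is used for both separators, following the stated delimiter choice rather than the colon shown in the desired output. The function name reformat_rows is hypothetical.

```sh
# Hypothetical helper; uses only 2-argument match(), RSTART and RLENGTH,
# so it does not depend on GNU-specific awk extensions.
reformat_rows() {
    awk '
    # Data row: starts with a numeric ID and ends with "(count)".
    # Anchoring on $ means only the final bracketed number is split off.
    /^[0-9]+ / && match($0, / *\([0-9]+\) *$/) {
        count = substr($0, RSTART, RLENGTH)
        gsub(/[^0-9]/, "", count)            # keep just the digits
        rest = substr($0, 1, RSTART - 1)     # ID plus text, without the count
        id = rest;   sub(/ .*/, "", id)      # everything before the first space
        text = rest; sub(/^[0-9]+ +/, "", text)
        printf "%s;\"%s\";%s\n", id, text, count
        next
    }
    # Any other non-empty line is treated as a header row.
    NF { printf "\"%s\"\n", $0 }
    ' "$@"
}

reformat_rows <<'EOF'
Metabolism
Global and overview maps
01100 Metabolic pathways (1689)
012111 Carbon metabolism (151) test: test test (44)
EOF
```

With a file on disk, the same function could be called as reformat_rows test.txt. Because the text is taken with substr() rather than passed through sub() as a pattern, bracketed numbers inside the text such as "(151)" are left untouched.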