
I have a file with a large number of columns, like this:

    ASN 1 | R ASN 1 | 0.000 +/- 0.000 | -0.000 +/- 0.000 | 0.045 +/- 0.034 | -0.045 +/- 0.034 | 0.000 +/- 0.000 | 0.000 +/- 0.001
    HID 2 | R HID 2 | 0.000 +/- 0.000 | -0.000 +/- 0.000 | 0.001 +/- 0.002 | -0.001 +/- 0.002 | 0.000 +/- 0.000 | 0.000 +/- 0.001
    PRO 3 | R PRO 3 | 0.000 +/- 0.000 | -0.000 +/- 0.000 | 0.001 +/- 0.004 | -0.001 +/- 0.004 | 0.000 +/- 0.000 | -0.000 +/- 0.001
    LYS 4 | R LYS 4 | 0.000 +/- 0.000 | -0.000 +/- 0.000 | 0.182 +/- 0.073 | -0.176 +/- 0.072 | 0.000 +/- 0.000 | 0.005 +/- 0.003
    MET 5 | R MET 5 | 0.000 +/- 0.000 | -0.000 +/- 0.000 | -0.004 +/- 0.004 | 0.006 +/- 0.004 | 0.000 +/- 0.000 | 0.002 +/- 0.001

From this file I need to extract only the first and last columns, removing the error value (the `+/- value` part) from the last column, to obtain something like:

    ASN 1 0.000

Strangely, the command below works well, except that it does not remove the error from the last column:

    gawk -F'[|]' '{print $1, $NF}' $file
    ASN 1 0.000 +/- 0.001
    HID 2 -0.000 +/- 0.001
    PRO 3 -0.000 +/- 0.001
    LYS 4 0.000 +/- 0.001
    MET 5 -0.000 +/- 0.001
    GLU 6 -0.000 +/- 0.001
    MET 7 0.000 +/- 0.001
    ILE 8 0.000 +/- 0.001
    LEU 9 0.001 +/- 0.001

Alternatively, when I replace it with

gawk -F'[|,+/-]' '{print $1, $(NF-1)}' $file 

it does not print the column before the last one (the value); instead it subtracts 1 from the last (error) column:

    ASN 1 -0.999
    HID 2 -0.999
    PRO 3 -0.999
    LYS 4 -0.997

What should I correct to fix the script?

2 Answers

1

Your regex for the field separator is wrong. Use it like this:

    gawk -F'\\||\\+/-' 'NF>1{print $1, $(NF-1)}' file
    ASN 1 0.000
    HID 2 0.000
    PRO 3 -0.000
    LYS 4 0.005
    MET 5 0.002

i.e. use double escaping for regex meta characters like | or +.
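As a minimal sketch of why the separator matters, the snippet below splits one line of the question's sample data on either `|` or the literal string `+/-`. It spells the separator with bracket expressions (`[|]` and `[+]`) rather than backslashes; that spelling is an assumption made here to sidestep quoting differences, and the idea is the same as the double-escaped separator above. It also contrasts `$(NF-1)`, the field before the last, with `$NF-1`, which awk parses as the last field minus one.

```shell
# One line of the sample data from the question.
line='LYS 4 | R LYS 4 | 0.000 +/- 0.000 | -0.000 +/- 0.000 | 0.182 +/- 0.073 | -0.176 +/- 0.072 | 0.000 +/- 0.000 | 0.005 +/- 0.003'

# Fields are separated by "|" or by the three-character string "+/-",
# so $(NF-1) is the last value and $NF is its error.
# Adding 0 forces a numeric conversion, which drops the surrounding spaces.
value=$(printf '%s\n' "$line" | awk -F'[|]|[+]/-' 'NF > 1 { print $(NF-1) + 0 }')

# $NF-1 parses as ($NF) - 1: it subtracts one from the error instead.
wrong=$(printf '%s\n' "$line" | awk -F'[|]|[+]/-' 'NF > 1 { print $NF - 1 }')

echo "value=$value wrong=$wrong"
```

The `NF > 1` guard skips empty lines, where `NF` is 0 and `$(NF-1)` would ask for field -1.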

Code Demo


7 Comments

It produced an error: OMP_OR2W1_linalol.dat FNR=5) fatal: attempt to access field -1
With your sample data, what I pasted here is the actual output I got from gawk 4.1.1. By the way, this also works with my BSD awk.
It really only works when I use $NF-1, but in that case it just subtracts ONE from the LAST VALUE :)
@user3470313 clearly you have empty lines in your file. Just prefix the action block with NF so it only executes on lines that aren't empty.
As suggested by @EdMorton, I added a safeguard condition NF> in the answer. Try again. btw did you check demo link?
0

When you use -F'[|]', you are stating that | is a field separator. Using -F[|+/-] means you're using any of these characters as a field separator: |, +, /, or -.
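The difference can be sketched by counting fields for one `value +/- error` pair under both separators. This is only an illustration; the input string is taken from the question's LYS row, and `[+]/-` is one assumed way of writing the literal string `+/-` as a regex.

```shell
# A bracket expression splits on each single character, so the leading
# minus sign and each of "+", "/", "-" all act as separators.
n_class=$(printf '%s\n' '-0.176 +/- 0.072' | awk -F'[|+/-]' '{ print NF }')

# Matching the three-character string "+/-" as a unit yields two fields:
# the value and the error.
n_string=$(printf '%s\n' '-0.176 +/- 0.072' | awk -F'[+]/-' '{ print NF }')

echo "class=$n_class string=$n_string"
```

With the bracket expression, the fields around `+/-` are empty strings, which is why picking a field by position from the end does not give the value.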

You have two choices:

  • Use spaces, but then understand that you need to calculate your columns a bit differently since +/- is now a column. I print columns 1, 2, and the third from the last.

For example:

    $ awk '{printf ("%-5.5s %2d %10.3f\n", $1, $2, $(NF - 2))}' test.txt
    ASN 1 0.001
    HID 2 0.001
    PRO 3 0.001
    LYS 4 0.003
    MET 5 0.001

  • Or, you can use a fancier regular expression that says you want to separate fields via *\| * or *+/- *. Note I include the spaces in my regular expression field separator. This way, spaces are stripped from my columns:

Note my regular expression:

    $ awk -F' *\| *| *\+/- *' \
        '{printf ("%-5.5s %2d %10.3f\n", $1, $2, $NF)}' file
    ASN 1 0.001
    HID 2 0.001
    PRO 3 0.001
    LYS 4 0.003
    MET 5 0.001

This works with standard awk on BSD and nawk on Solaris. gawk might do things a bit differently.
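Another way to sketch the same result, assuming the last column always has the form `value +/- error`: keep `|` (with optional surrounding spaces) as the only separator and strip the error part from the last field with `sub()`. The input below is a trimmed, hypothetical version of the question's data with a blank line added, not the real file.

```shell
# Trimmed sample rows in the question's format, with one blank line.
# Split on "|" and its surrounding spaces; skip blank lines (NF is 0 there);
# delete " +/- error" from a copy of the last field, then print name and value.
result=$(printf '%s\n' \
    'ASN 1 | R ASN 1 | 0.000 +/- 0.000 | 0.000 +/- 0.001' \
    '' \
    'LYS 4 | R LYS 4 | 0.182 +/- 0.073 | 0.005 +/- 0.003' |
  awk -F' *[|] *' 'NF { v = $NF; sub(/ *[+]\/-.*/, "", v); print $1, v }')

printf '%s\n' "$result"
```

Because the separator never touches `+`, `/`, or `-`, negative values such as `-0.176` stay intact, and the field count per row does not depend on how many `+/-` pairs precede the last column.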

4 Comments

It again produces an error if I use $(NF - 1), and alternatively subtracts 1 from the value of the last column if I use $NF - 1. It seems the problem is with my AWK, isn't it?
Are you cutting and pasting these commands from my answer? The first one works with no errors on GNU Awk 3.1.5 on RHEL. The second one gives me a warning that | and + will be treated as plain characters and not as special regular expressions. Otherwise, it works.
What machine are you using? What is your OS? Maybe you have an OS like Solaris where things don't work as they would on BSD or Linux.
The OS doesn't matter. He's using gawk, it works just fine. The OP needs to tell us more about his input file and what exactly the problems/errors are he's seeing.
