How to divide values from one column by another and print results in a new column?

Question

I am relatively new to awk, so I have a simple question about doing division and printing the results in a new column. For example:

    head data
    1 13273 . G C 563 5 . 25 128
    1 202259 . G T 675 8 . 12 130
    1 598934 . C C 756 9 . 17 231
    1 634112 . T C 125 1 . 32 89
    1 779762 . G A 675 5 . 28 187

I would like to divide column 9 by column 10 and print the results in a new column 11, preferably sorted from high to low. For example:

    1 634112 . T C 125 1 . 32 89 0.360
    1 13273 . G C 563 5 . 25 128 0.195
    1 779762 . G A 675 5 . 28 187 0.150
    1 202259 . G T 675 8 . 12 130 0.092
    1 598934 . C C 756 9 . 17 231 0.074

I only know how to do it in R, but I wanted to learn how we can do it in awk. Thanks!

jas · Accepted Answer · 2017-06-15 20:48:11Z

Awk is quite expressive with respect to the first requirement. If you want a column 11, you can just invent it and set it equal to the result of dividing column 9 by column 10.

It's possible to do the sort in awk, but it's a bit of a pain, so it's easier just to pipe to sort. The column command makes the output prettier, nothing more than that.

    $ awk '{$11 = $9 / $10}1' file | sort -nr -k 11 | column -t
    1 634112 . T C 125 1 . 32 89 0.359551
    1 13273 . G C 563 5 . 25 128 0.195312
    1 779762 . G A 675 5 . 28 187 0.149733
    1 202259 . G T 675 8 . 12 130 0.0923077
    1 598934 . C C 756 9 . 17 231 0.0735931
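The trailing `1` in the one-liner above is a pattern that is always true and has no action, so awk falls back to its default action of printing the (now modified) record. As a small sketch, the two forms below should produce identical output; `sample.txt` is a hypothetical file name holding two rows copied from the question.

```shell
# Two rows from the question, written to a hypothetical sample.txt
printf '%s\n' \
  '1 13273 . G C 563 5 . 25 128' \
  '1 634112 . T C 125 1 . 32 89' > sample.txt

# Compact form: the bare 1 triggers the default print
short=$(awk '{$11 = $9 / $10}1' sample.txt)

# Explicit form: same assignment, followed by an explicit print
explicit=$(awk '{ $11 = $9 / $10; print }' sample.txt)

printf '%s\n' "$explicit"
```

The explicit form is a little longer but may be easier to read when the program grows.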

If your output needs to be tab separated, you can set the OFS variable (and forget about the column command):

    $ awk -v OFS='\t' '{$11 = $9 / $10}1' file | sort -nr -k 11
    1 634112 . T C 125 1 . 32 89 0.359551
    1 13273 . G C 563 5 . 25 128 0.195312
    1 779762 . G A 675 5 . 28 187 0.149733
    1 202259 . G T 675 8 . 12 130 0.0923077
    1 598934 . C C 756 9 . 17 231 0.0735931
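Setting OFS affects every column here, not just the new one, because assigning to any field makes awk rebuild the whole record with OFS between fields. A tiny sketch of that behaviour, using a made-up three-field input line:

```shell
# No field is assigned, so $0 is printed untouched and OFS is never applied
unchanged=$(printf 'a b c\n' | awk -v OFS='\t' '1')

# Assigning a field (even to itself) makes awk rebuild $0 using OFS
rebuilt=$(printf 'a b c\n' | awk -v OFS='\t' '{ $1 = $1 } 1')

printf '%s\n' "$unchanged" "$rebuilt"
```

The first line printed keeps its spaces; the second has tabs between the fields.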

Finally, you can use sprintf to format that last column as in your sample output:

    $ awk -v OFS='\t' '{$11 = sprintf("%.3f", $9 / $10)}1' file | sort -nr -k 11
    1 634112 . T C 125 1 . 32 89 0.360
    1 13273 . G C 563 5 . 25 128 0.195
    1 779762 . G A 675 5 . 28 187 0.150
    1 202259 . G T 675 8 . 12 130 0.092
    1 598934 . C C 756 9 . 17 231 0.074

UPDATE:

As Ed Morton shows in his answer, the ternary operator ?: can be used to protect against dividing by zero. Here I've put "UND" in column 11 to indicate "undefined", but of course you can just leave it blank or put some different value.

    $ awk -v OFS='\t' '{$11 = ($10 != 0) ? sprintf("%.3f", $9 / $10) : "UND"}1' file | sort -nr -k 11
    1 634112 . T C 125 1 . 32 89 0.360
    1 13273 . G C 563 5 . 25 128 0.195
    1 779762 . G A 675 5 . 28 187 0.150
    1 202259 . G T 675 8 . 12 130 0.092
    1 598934 . C C 756 9 . 17 0 UND

At some point you might decide that the awk program is getting complicated enough that it's better off in its own file with an emphasis more on readability than compactness.

    $ cat div.awk
    BEGIN { OFS="\t" }
    {
        if ($10 != 0) {
            quotient = $9 / $10
            $11 = sprintf("%.3f", quotient)
        }
        else {
            $11 = "UND"
        }
        print
    }

    $ awk -f div.awk file | sort -nr -k 11
    1 634112 . T C 125 1 . 32 89 0.360
    1 13273 . G C 563 5 . 25 128 0.195
    1 779762 . G A 675 5 . 28 187 0.150
    1 202259 . G T 675 8 . 12 130 0.092
    1 598934 . C C 756 9 . 17 0 UND
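The UND row ends up last in the output above because a numeric sort treats a key with no leading number as zero, and every real ratio here is positive. The sketch below makes that explicit. Two details are my own additions rather than part of the answer: the key is limited to column 11 with `-k11,11nr`, and `LC_ALL=C` pins the decimal point to `.` in case the locale uses a comma. `sample.txt` is a hypothetical file name.

```shell
# Three rows, one with a zero in column 10, written to a hypothetical sample.txt
printf '%s\n' \
  '1 598934 . C C 756 9 . 17 0' \
  '1 202259 . G T 675 8 . 12 130' \
  '1 634112 . T C 125 1 . 32 89' > sample.txt

# Same zero guard as the one-liner above; sort on column 11 only, numeric, descending
sorted=$(awk -v OFS='\t' '{$11 = ($10 != 0) ? sprintf("%.3f", $9 / $10) : "UND"}1' sample.txt |
  LC_ALL=C sort -k11,11nr)

printf '%s\n' "$sorted"
```

If ratios could be negative, UND would sort into the middle, so a separate sentinel or a second pass may be a better fit in that case.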

Thanks for your code! I wonder what we should do if $10 is zero? How to prevent getting an error?
You're welcome! See update for one way of checking for zero.

Ed Morton · Accepted Answer · 2017-06-14 21:26:03Z

With GNU awk for sorted_in:

    $ cat tst.awk
    { a[NR]=$0; v[NR]=$9/$10 }
    END {
        PROCINFO["sorted_in"]="@val_num_desc"
        for (i in v) {
            print a[i] "\t" v[i]
        }
    }

    $ awk -f tst.awk file
    1 634112 . T C 125 1 . 32 89 0.359551
    1 13273 . G C 563 5 . 25 128 0.195312
    1 779762 . G A 675 5 . 28 187 0.149733
    1 202259 . G T 675 8 . 12 130 0.0923077
    1 598934 . C C 756 9 . 17 231 0.0735931

Change v[NR]=$9/$10 to v[NR]=($10==0 ? 0 : $9/$10) or similar to protect against divide-by-zero if $10 can be zero.
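For an awk without gawk's `PROCINFO["sorted_in"]`, the sort can be done by hand in the END block. The following is a minimal sketch, not either answerer's method: it swaps in a plain insertion sort, reuses the zero guard suggested above, and reads a few rows from a hypothetical `sample.txt`. The array names `row` and `val` are my own.

```shell
# Three rows, one with a zero in column 10, written to a hypothetical sample.txt
printf '%s\n' \
  '1 13273 . G C 563 5 . 25 128' \
  '1 634112 . T C 125 1 . 32 89' \
  '1 598934 . C C 756 9 . 17 0' > sample.txt

result=$(awk '
{
    row[NR] = $0
    val[NR] = ($10 == 0 ? 0 : $9 / $10)   # zero guard, as suggested above
}
END {
    # Insertion sort on val[], descending; row[] is moved in step with it
    for (i = 2; i <= NR; i++) {
        v = val[i]; r = row[i]
        for (j = i - 1; j >= 1 && val[j] < v; j--) {
            val[j + 1] = val[j]
            row[j + 1] = row[j]
        }
        val[j + 1] = v
        row[j + 1] = r
    }
    for (i = 1; i <= NR; i++)
        print row[i] "\t" val[i]
}' sample.txt)

printf '%s\n' "$result"
```

Insertion sort is quadratic, so for large files piping to sort is likely the better choice; this only shows why doing it inside awk is "a bit of a pain".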
