
I am familiar with comparing numbers in a row using bash, but what if I want to compare columns? Say I have a file with

4 2 5 7
6 1 3 8

and I want to find the largest value in each column:

6 2 5 8 

How can I do this using awk?

  • Is the number of columns fixed? Commented Mar 23, 2014 at 19:44
  • There's surely a bash/awk solution for your problem. That being said, why would you tackle such a problem in bash in the first place?! It's just the wrong language for anything but simple 10-line scripts. Sure, you can write more complex code (I did), but it's so much harder than using Python, Perl, TCL, Lua, etc. Commented Mar 23, 2014 at 19:52
  • @unxnut no, the number of columns is not fixed Commented Mar 23, 2014 at 19:56
  • Is it the same number of fields in each record? Commented Mar 23, 2014 at 19:58

2 Answers


Assuming that your data is in a file called "data":

$ awk '{for(j=1;j<=NF;++j){max[j]=(max[j]>$j)?max[j]:$j};mNF=mNF>NF?mNF:NF} END{for(j=1;j<=mNF;++j)printf " " max[j]; print "" }' data
 6 2 5 8

Or, with different output formatting:

$ awk '{for(j=1;j<=NF;++j){max[j]=(max[j]>$j)?max[j]:$j};mNF=mNF>NF?mNF:NF} END{for(j=1;j<=mNF;++j)print "max of column " j "=" max[j]}' data
max of column 1=6
max of column 2=2
max of column 3=5
max of column 4=8

The body of the above awk program consists of:

{ for(j=1;j<=NF;++j) {max[j]=(max[j]>$j)?max[j]:$j};mNF=mNF>NF?mNF:NF } 

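This block runs for every line of the input file. The loop visits every column (field) in the line, from 1 to the number of fields (NF). For each column, it checks whether the current value is greater than the maximum seen so far and, if so, stores it in the max array. After the loop, mNF is updated to the largest field count seen on any line, so the END block knows how many columns to print even when lines have different numbers of fields.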
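One caveat with the one-liner: an unset max[j] compares as 0, so a column that contains only negative numbers would come out empty. The following is a minimal sketch of one way around that, assuming every field is numeric. The function name colmax is hypothetical and not part of the original answer; it seeds each column's maximum from the first value seen in that column.

```shell
# colmax: hypothetical helper name, not from the original answer.
# Prints the per-column maximum of whitespace-separated numeric input,
# read from the files given as arguments or from stdin.
colmax() {
    awk '
    {
        for (j = 1; j <= NF; ++j) {
            # Seed max[j] the first time column j appears, so negative
            # values are never compared against an unset (zero) entry.
            if (!(j in max) || $j + 0 > max[j])
                max[j] = $j + 0
        }
        if (NF > mNF) mNF = NF   # remember the widest line seen so far
    }
    END {
        # Space-separated output with no leading or trailing blank.
        for (j = 1; j <= mNF; ++j)
            printf "%s%s", max[j], (j < mNF ? " " : "\n")
    }' "$@"
}

printf '%s\n' '4 2 5 7' '6 1 3 8' | colmax    # prints: 6 2 5 8
printf '%s\n' '-4 -2' '-6 -1 -3' | colmax     # prints: -4 -1 -3
```

With the question's file this could be called as colmax data. The `$j + 0` forces a numeric comparison, which is an assumption about the input; non-numeric fields would be treated as 0.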


5 Comments

+1: OP can use END{for(j=1;j<=NF;++j)printf "%s%s",max[j],(j==NF?"\n":FS)}' to print according to the desired output.
@jaypal, or even END {for (j=1;j<=NF;j++) printf "%s ",max[j]; print ""} -- a bit less tricky
@glennjackman Absolutely... I am just more inclined towards job security! ;)
You need to fix it for a non-fixed number of columns: awk '{for(j=1;j<=NF;++j)max[j]=(max[j]>$j)?max[j]:$j;mNF=mNF>NF?mNF:NF} END{for(j=1;j<=mNF;++j) printf max[j] FS; print RS }' file
@BMW Thanks! That is a nice generalization. I added it to the answer.

Bash solution using arrays:

max=()
while read -a line; do
    for ((i=0; i<${#line[@]}; i++)); do
        ((line[i]>max[i])) && max[i]=${line[i]}
    done
done < input
echo "${max[@]}"
