Find the min and max value in two columns

Question

I have some data which looks like:

sampleA ATGC 10 100
sampleA ATGC 120 230
sampleA ATGC 200 110

I want to print the minimum and maximum taken over the values in columns 3 and 4 combined, so my output should look like:

sampleA 10 230

Thanks in advance

Do you want to print the min value from column 3 only and the max value from column 4 only? Or, if column 3 were 1000, 120, 200 and column 4 were 100, 230, 10, would you want your result to be: sampleA 1000 10?
– jesse_b, Commented Aug 9, 2017 at 12:18
The max or min could be in column 3 or 4. Because this is DNA ORF information, some reads are in reverse, like the last line in the above example.
– AudileF, Commented Aug 9, 2017 at 14:14
So in that case I believe both answers below won't work. Never mind, I just noticed AFSHIN edited his answer.
– jesse_b, Commented Aug 9, 2017 at 14:30
It appears that columns 1 and 2 have absolutely nothing to do with the output. Are the column 1 values identical in all rows? If not, I don't see how you can even define what the first field of output should be.
– Wildcard, Commented Aug 9, 2017 at 20:48

RomanPerekhrest · Accepted Answer · 2017-08-09 20:34:11Z

Short awk solution (needs GNU awk, since asort() is a gawk extension):

awk '{ a[++c]=$3; a[++c]=$4 }END{ asort(a); print $1,a[1],a[length(a)] }' file

The output:

sampleA 10 230

Short datamash solution (this computes the minimum of the 3rd column and the maximum of the 4th column separately, not across both columns):

datamash -W -g1 min 3 max 4 < file

-W - use whitespace as the field separator
-g1 - group records by the 1st column value
min 3 - get the minimum value of the 3rd column
max 4 - get the maximum value of the 4th column

The output:

sampleA 10 230
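
If column 1 can hold more than one sample name (the point Wildcard raises in the comments), the grouping that -g1 provides can be combined with a min/max over both columns in plain awk. The following is a minimal sketch and not one of the original answers: the function name minmax_by_sample is made up for illustration, and it assumes whitespace-separated input with numeric values in columns 3 and 4.

```shell
# Hypothetical helper: per-sample min and max taken over columns 3 and 4 together.
# Assumes whitespace-separated input with numeric values in fields 3 and 4.
minmax_by_sample() {
    awk '
    NF >= 4 {
        for (i = 3; i <= 4; i++) {
            v = $i + 0                      # force numeric comparison
            if (!($1 in min)) {             # first value seen for this sample
                min[$1] = max[$1] = v
                order[++n] = $1             # remember first-seen order
            }
            if (v < min[$1]) min[$1] = v
            if (v > max[$1]) max[$1] = v
        }
    }
    END {
        for (k = 1; k <= n; k++)
            print order[k], min[order[k]], max[order[k]]
    }' "$@"
}

# Example with the data from the question (read from standard input):
printf '%s\n' 'sampleA ATGC 10 100' 'sampleA ATGC 120 230' 'sampleA ATGC 200 110' |
    minmax_by_sample    # prints: sampleA 10 230
```

It avoids asort(), so it should not depend on GNU awk, and it prints one line per sample in the order the samples first appear.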

I'm not familiar with datamash. Can it find the min/max across both columns, as the OP has clarified?
– αғsнιη, Commented Aug 9, 2017 at 17:52

αғsнιη · Accepted Answer · 2017-08-09 17:42:05Z

Using awk:

awk 'BEGIN{getline; min=$3;max=$4} {(min>$3)?min=$3:"";(max>$4)?"":max=$4} END{print min, max}' infile.txt

The output is:

10 230

But I guess you are looking for something like the command below, which finds the min and max across both columns, rather than only the min of the 3rd column and the max of the 4th column as the command above does.

Sample Input:

sampleA ATGC 10 100
sampleA ATGC 300 2
sampleA ATGC 200 1100
sampleA ATGC 2301 9
sampleA ATGC 12345 15
sampleA ATGC 235 7

The command:

awk 'BEGIN{getline;min=max=$3; ($4>$3)?max=$4:min=$4} { ($3>$4 && min>$4)?min=$4:((min>$3)?min=$3:""); ($3>$4 && $3>max)?max=$3:((max<$4)?max=$4:""); } END{print min, max}' infile.txt

The output would be:

2 12345
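
One way to cross-check this result is to pour both columns into a single stream and let sort do the work. This is a brute-force sketch rather than part of the answer: it writes the sample input above to a temporary file, ignores the sample name, and relies only on standard awk, sort, sed and paste.

```shell
# Cross-check: list every value from columns 3 and 4, sort numerically,
# then keep the first (smallest) and last (largest) line.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
sampleA ATGC 10 100
sampleA ATGC 300 2
sampleA ATGC 200 1100
sampleA ATGC 2301 9
sampleA ATGC 12345 15
sampleA ATGC 235 7
EOF

result=$(awk '{ print $3; print $4 }' "$tmp" | sort -n | sed -n '1p;$p' | paste -sd ' ' -)
echo "$result"    # prints: 2 12345
rm -f "$tmp"
```

Sorting is slower than a single awk pass on large files, but it is easy to read and makes a handy sanity check for the ternary logic.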

user245787 · Accepted Answer · 2017-08-09 21:07:43Z

NF == 4 {
    if (++totalSamples == 1) {
        sampleName = $1
        minValue = $3;
        maxValue = $3;
    } else {
        if ($3 < minValue) minValue = $3
        else if ($3 > maxValue) maxValue = $3
    }
    if ($4 < minValue) minValue = $4
    else if ($4 > maxValue) maxValue = $4
}
END {
    if (totalSamples)
        printf("%s %d %d\n", sampleName, minValue, maxValue)
}
