file -

xyz.161209:/userlogs/logs/reports 355G 195G 150G 57% /home
xyz.161209:/userlogs/logs/reports 355G 197G 148G 58% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 218G 129G 63% /home
xyz.161210:/userlogs/logs/reports 355G 223G 124G 65% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 210G 136G 61% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home
xyz.161211:/userlogs/logs/reports 355G 173G 171G 51% /home

Result

xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home

The first column contains a date and is sorted. I want to group the rows by the first column and, for each group, display only the row whose fifth column holds the maximum %.

Here's what I have tried; however, I can't get the desired output. Can anyone help me?

awk -F, '{if ((a[substr($1,5,6)] == substr($1,5,6)) && (b[substr($5,1,2)] < substr($5,1,2)))b[substr($5,1,2)]=substr($5,1,2);}END{for(i in a){print i,a[i];}}' test.txt 
  • Hi, welcome to StackExchange, what have you tried so far to achieve this ? Commented Dec 11, 2016 at 10:50
  • No. Any examples or help I can get here? Commented Dec 12, 2016 at 5:43
  • Worth popping that into original question with the formatting to make it more readable please Commented Dec 12, 2016 at 13:19

3 Answers


This is what I came up with; I doubt it's the most efficient way, and I'd like to see something better. However, it does the job:

sort test.txt | awk -F':' '{print $1}' | uniq > unique.txt
while read p; do
  grep $p test.txt | sort -r -k5 | head -1
done < unique.txt
rm unique.txt

Explanation:

sort test.txt | awk -F':' '{print $1}' | uniq > unique.txt extracts all the unique file-name prefixes from the list.

grep $p test.txt | sort -r -k5 | head -1 sorts each group on the 5th field (which holds the % value) in descending order and prints only the first line.
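To see the loop end to end, here is the answer's pipeline wrapped in a self-contained sketch. The sample values come from the question; the `read -r`, the quoting of `"$p"`, and `grep -F` are small hardenings added here (they stop the `.` in the prefix from acting as a regex metacharacter), not part of the original answer:

```shell
#!/bin/sh
# Throwaway sample in the question's format.
cat > test.txt <<'EOF'
xyz.161209:/userlogs/logs/reports 355G 195G 150G 57% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 218G 129G 63% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
EOF

# Step 1: collect the unique name.date prefixes (everything before ':').
sort test.txt | awk -F':' '{print $1}' | uniq > unique.txt

# Step 2: for each prefix, keep the line with the largest 5th field.
# -F treats the prefix as a fixed string, so the '.' is literal.
while read -r p; do
  grep -F "$p" test.txt | sort -r -k5 | head -1
done < unique.txt

rm unique.txt test.txt
```

Note that `sort -r -k5` here is a lexical sort; it works because every percentage in the sample is two digits wide. `sort -rn -k5` would be the safer choice if single- and triple-digit percentages could be mixed.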

  • +1: Your use of -k in sort led to my answer, as I originally was about to post an almost identical but slightly more bloated answer Commented Dec 13, 2016 at 1:16

This works under ksh for me:

sort -nrk5 -t ' ' test.txt | sort -t '.' -unk2 

Given this test file:

otherfile_.161209:/userlogs/logs/reports 000G 000G 000G 55% /home
somefile_.161209:/userlogs/logs/reports 000G 000G 000G 45% /home
file71.161209:/userlogs/logs/reports 000G 000G 000G 71% /home
file_longer_12.161209:/userlogs/logs/reports 000G 000G 000G 78% /home
qwerty_.161210:/userlogs/logs/reports 000G 000G 000G 31% /home
xyz.161210:/userlogs/logs/reports 000G 000G 000G 34% /home
abcdef.161210:/userlogs/logs/reports 000G 000G 000G 85% /home
hellojoe_.161210:/userlogs/logs/reports 000G 000G 000G 45% /home
kitchen_.161211:/userlogs/logs/reports 000G 000G 000G 39% /home
room.161211:/userlogs/logs/reports 000G 000G 000G 95% /home
rooftop_77.161211:/userlogs/logs/reports 000G 000G 000G 12% /home
f.161211:/userlogs/logs/reports 000G 000G 000G 30% /home

This is the result:

file_longer_12.161209:/userlogs/logs/reports 000G 000G 000G 78% /home
abcdef.161210:/userlogs/logs/reports 000G 000G 000G 85% /home
room.161211:/userlogs/logs/reports 000G 000G 000G 95% /home

Therefore it allows for filenames that:

  • Have differing lengths
  • Contain numeric characters

Breakdown:

  • sort -nrk5 -t ' ' : Initially sort by percentage in column 5
  • sort -t '.' -unk2 : Print unique results, only counting the date string from the first field (using a . separator)
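The two-pass trick can be checked on a small inline sample (the filenames and sizes below are invented for the demo). One caveat worth stating: the second pass relies on GNU sort, where -u keeps the first line of each run of equal keys and disables the whole-line tie-break, so the highest percentage from pass 1 is the line that survives; other sort implementations may not guarantee this:

```shell
#!/bin/sh
# Tiny sample: two dates, two percentages each (invented values).
cat > sample.txt <<'EOF'
aaa.161209:/x 10G 1G 9G 57% /home
bbb.161209:/x 10G 2G 8G 59% /home
ccc.161210:/x 10G 3G 7G 66% /home
ddd.161210:/x 10G 4G 6G 63% /home
EOF

# Pass 1: numeric descending sort on field 5 (the percentage);
# -n reads the leading digits of "59%" as 59.
# Pass 2: one line per date (field 2 when split on '.'); GNU sort
# with -u keeps the first line of each equal-key run, i.e. the max %.
sort -nrk5 -t ' ' sample.txt | sort -t '.' -unk2

rm sample.txt
```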
  • you could use the -r in the first sort itself instead of piping it to the second one. Should achieve the same result. And upvoted, definitely a better resolution than mine.. Commented Dec 13, 2016 at 1:16
  • However, I see a problem with the -w10: what if the filenames aren't all the same length? The user posted xyz for all of them, but there could be file names of different lengths, e.g. qwerty. Commented Dec 13, 2016 at 1:17
  • @debal - Thanks - answer amended now using sort instead Commented Dec 13, 2016 at 2:38
  • Thanks. Somehow I got not the highest % for each group, but the lowest %. Commented Dec 14, 2016 at 1:32

How about this awk:

awk -F"[.: ]" '{if($(NF-1)+0>Arr[$2]+0){Arr[$2]=$(NF-1)+0;Res[$2]=$0}}END{for (i in Res){print Res[i]}}' file

xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home

Or, spread out for readability:

awk -F"[.: ]" '{
  if($(NF-1)+0>Arr[$2]+0) {
    Arr[$2]=$(NF-1)+0;
    Res[$2]=$0
  }
}
END{
  for (i in Res) {
    print Res[i]
  }
}' file
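One property of this answer worth making concrete: `for (i in Res)` iterates in an unspecified order, which is why the groups above come out unsorted. The sketch below reruns the same one-liner on a sample cut down from the question's data, with a trailing sort appended here purely to make the output deterministic:

```shell
#!/bin/sh
# Sample cut down from the question's data.
cat > file <<'EOF'
xyz.161209:/userlogs/logs/reports 355G 195G 150G 57% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home
xyz.161211:/userlogs/logs/reports 355G 173G 171G 51% /home
EOF

# Splitting on '.', ':' or space puts the date in $2 and the
# percentage in $(NF-1); '+0' coerces "59%" to the number 59.
# "for (i in Res)" yields groups in unspecified order, so pipe
# through sort for a stable result.
awk -F"[.: ]" '{if($(NF-1)+0>Arr[$2]+0){Arr[$2]=$(NF-1)+0;Res[$2]=$0}}
               END{for (i in Res){print Res[i]}}' file | sort

rm file
```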
