file -

xyz.161209:/userlogs/logs/reports 355G 195G 150G 57% /home
xyz.161209:/userlogs/logs/reports 355G 197G 148G 58% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 218G 129G 63% /home
xyz.161210:/userlogs/logs/reports 355G 223G 124G 65% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 210G 136G 61% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home
xyz.161211:/userlogs/logs/reports 355G 173G 171G 51% /home

Result

xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home

The first column contains a date and is sorted. I want to group the rows by the first column and, for each group, display only the row whose fifth column holds the maximum %.

Here's what I have tried; however, I can't get the desired output. Can anyone help me?

awk -F, '{if ((a[substr($1,5,6)] == substr($1,5,6)) && (b[substr($5,1,2)] < substr($5,1,2)))b[substr($5,1,2)]=substr($5,1,2);}END{for(i in a){print i,a[i];}}' test.txt 
  • Hi, welcome to StackExchange, what have you tried so far to achieve this ? Commented Dec 11, 2016 at 10:50
  • No. Any examples or help I can get here? Commented Dec 12, 2016 at 5:43
  • Worth popping that into original question with the formatting to make it more readable please Commented Dec 12, 2016 at 13:19

3 Answers


This is what I came up with; I doubt it's the most efficient way, and I'd like to see something better. However, it does the job:

sort test.txt | awk -F':' '{print $1}' | uniq > unique.txt
while read p; do
  grep $p test.txt | sort -r -k5 | head -1
done < unique.txt
rm unique.txt

Explanation:

sort test.txt | awk -F':' '{print $1}' | uniq > unique.txt extracts all the unique file-name prefixes from the list.

grep $p test.txt | sort -r -k5 | head -1 sorts each group on the 5th field (which holds the % value) in descending order and prints only the first line.
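To see the loop end to end, here is the answer's pipeline wrapped in a self-contained sketch. The sample values come from the question; the `read -r`, the quoting of `"$p"`, and `grep -F` are small hardenings added here (they stop the `.` in the prefix from acting as a regex metacharacter), not part of the original answer:

```shell
#!/bin/sh
# Throwaway sample in the question's format.
cat > test.txt <<'EOF'
xyz.161209:/userlogs/logs/reports 355G 195G 150G 57% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 218G 129G 63% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
EOF

# Step 1: collect the unique name.date prefixes (everything before ':').
sort test.txt | awk -F':' '{print $1}' | uniq > unique.txt

# Step 2: for each prefix, keep the line with the largest 5th field.
# -F treats the prefix as a fixed string, so the '.' is literal.
while read -r p; do
  grep -F "$p" test.txt | sort -r -k5 | head -1
done < unique.txt

rm unique.txt test.txt
```

Note that `sort -r -k5` here is a lexical sort; it works because every percentage in the sample is two digits wide. `sort -rn -k5` would be the safer choice if single- and triple-digit percentages could be mixed.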

  • +1: Your use of -k in sort led to my answer, as I originally was about to post an almost identical but slightly more bloated answer Commented Dec 13, 2016 at 1:16

This works under ksh for me:

sort -nrk5 -t ' ' test.txt | sort -t '.' -unk2 

Given this test file:

otherfile_.161209:/userlogs/logs/reports 000G 000G 000G 55% /home
somefile_.161209:/userlogs/logs/reports 000G 000G 000G 45% /home
file71.161209:/userlogs/logs/reports 000G 000G 000G 71% /home
file_longer_12.161209:/userlogs/logs/reports 000G 000G 000G 78% /home
qwerty_.161210:/userlogs/logs/reports 000G 000G 000G 31% /home
xyz.161210:/userlogs/logs/reports 000G 000G 000G 34% /home
abcdef.161210:/userlogs/logs/reports 000G 000G 000G 85% /home
hellojoe_.161210:/userlogs/logs/reports 000G 000G 000G 45% /home
kitchen_.161211:/userlogs/logs/reports 000G 000G 000G 39% /home
room.161211:/userlogs/logs/reports 000G 000G 000G 95% /home
rooftop_77.161211:/userlogs/logs/reports 000G 000G 000G 12% /home
f.161211:/userlogs/logs/reports 000G 000G 000G 30% /home

This is the result:

file_longer_12.161209:/userlogs/logs/reports 000G 000G 000G 78% /home
abcdef.161210:/userlogs/logs/reports 000G 000G 000G 85% /home
room.161211:/userlogs/logs/reports 000G 000G 000G 95% /home

Therefore it allows for filenames that:

  • Have differing lengths
  • Contain numeric characters

Breakdown:

  • sort -nrk5 -t ' ' : Initially sort by percentage in column 5
  • sort -t '.' -unk2 : Print unique results, only counting the date string from the first field (using a . separator)
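The two-pass trick can be checked on a small inline sample (the filenames and sizes below are invented for the demo). One caveat worth stating: the second pass relies on GNU sort, where -u keeps the first line of each run of equal keys and disables the whole-line tie-break, so the highest percentage from pass 1 is the line that survives; other sort implementations may not guarantee this:

```shell
#!/bin/sh
# Tiny sample: two dates, two percentages each (invented values).
cat > sample.txt <<'EOF'
aaa.161209:/x 10G 1G 9G 57% /home
bbb.161209:/x 10G 2G 8G 59% /home
ccc.161210:/x 10G 3G 7G 66% /home
ddd.161210:/x 10G 4G 6G 63% /home
EOF

# Pass 1: numeric descending sort on field 5 (the percentage);
# -n reads the leading digits of "59%" as 59.
# Pass 2: one line per date (field 2 when split on '.'); GNU sort
# with -u keeps the first line of each equal-key run, i.e. the max %.
sort -nrk5 -t ' ' sample.txt | sort -t '.' -unk2

rm sample.txt
```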
  • you could use the -r in the first sort itself instead of piping it to the second one. Should achieve the same result. And upvoted, definitely a better resolution than mine.. Commented Dec 13, 2016 at 1:16
  • However, I see a problem with the -w10: what if the filenames aren't all the same length? The user posted xyz for all of them, but there could be file names of different lengths, e.g. qwerty. Commented Dec 13, 2016 at 1:17
  • @debal - Thanks - answer amended now using sort instead Commented Dec 13, 2016 at 2:38
  • Thanks. Somehow I got not the highest % for each group, but the lowest %. Commented Dec 14, 2016 at 1:32

How about this awk:

awk -F"[.: ]" '{if($(NF-1)+0>Arr[$2]+0){Arr[$2]=$(NF-1)+0;Res[$2]=$0}}END{for (i in Res){print Res[i]}}' file

xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home

Or, spread out for readability:

awk -F"[.: ]" '{
  if($(NF-1)+0>Arr[$2]+0) {
    Arr[$2]=$(NF-1)+0;
    Res[$2]=$0
  }
}
END{
  for (i in Res) {
    print Res[i]
  }
}' file
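One property of this answer worth making concrete: `for (i in Res)` iterates in an unspecified order, which is why the groups above come out unsorted. The sketch below reruns the same one-liner on a sample cut down from the question's data, with a trailing sort appended here purely to make the output deterministic:

```shell
#!/bin/sh
# Sample cut down from the question's data.
cat > file <<'EOF'
xyz.161209:/userlogs/logs/reports 355G 195G 150G 57% /home
xyz.161209:/userlogs/logs/reports 355G 201G 145G 59% /home
xyz.161210:/userlogs/logs/reports 355G 226G 121G 66% /home
xyz.161211:/userlogs/logs/reports 355G 220G 127G 64% /home
xyz.161211:/userlogs/logs/reports 355G 173G 171G 51% /home
EOF

# Splitting on '.', ':' or space puts the date in $2 and the
# percentage in $(NF-1); '+0' coerces "59%" to the number 59.
# "for (i in Res)" yields groups in unspecified order, so pipe
# through sort for a stable result.
awk -F"[.: ]" '{if($(NF-1)+0>Arr[$2]+0){Arr[$2]=$(NF-1)+0;Res[$2]=$0}}
               END{for (i in Res){print Res[i]}}' file | sort

rm file
```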
