
I have a directory similar to the following:

-rw-r--r-- 1 root root 223K Apr 28 14:25 2015.04.28_14.25
-rw-r--r-- 1 root root 253K Apr 28 14:55 2015.04.28_14.55
-rw-r--r-- 1 root root 276K Apr 28 15:25 2015.04.28_15.25
-rw-r--r-- 1 root root 254K Apr 28 15:55 2015.04.28_15.55
-rw-r--r-- 1 root root 122K Apr 29 09:08 2015.04.29_09.08
-rw-r--r-- 1 root root 127K Apr 29 09:38 2015.04.29_09.38
-rw-r--r-- 1 root root  67K Apr 29 11:43 2015.04.29_11.43
-rw-r--r-- 1 root root 137K May  1 12:13 2015.04.29_12.13
-rw-r--r-- 1 root root 125K May  1 12:43 2015.04.29_12.43
-rw-r--r-- 1 root root 165K May  1 13:13 2015.04.29_13.13
-rw-r--r-- 1 root root 110K May  1 13:43 2015.04.29_13.43

My question is, how would I find the largest file from each date?

For example, largest file from Apr 28, largest from Apr 29, May 1, etc.

OS info: Linux Kali 3.18.0-kali3-amd64 #1 SMP Debian 3.18.6-1~kali2 (2015-03-02) x86_64 GNU/Linux

  • What OS? Do you have GNU stat available? Commented Jun 3, 2015 at 15:30
  • @jordanm OS details added. Yes I do. Commented Jun 3, 2015 at 15:49
  • Is it the date in the file name or the modification time that matters? Commented Jun 3, 2015 at 21:36
  • @Gilles The modification time. Commented Jun 4, 2015 at 8:18

3 Answers


On GNU/anything,

ls -l --time-style=+%s \
| awk '{$6 = int($6/86400); print}' \
| sort -nk6,6 -nrk5,5 \
| sort -sunk6,6

That will get you UTC day boundaries; add your local time offset to the calculation as needed, e.g. int(($6-7*3600)/86400) for -0700 midnight boundaries.

  • Thanks. Marked as answer. Could you explain a bit more about what this is doing though? Commented Jun 4, 2015 at 8:20
  • 1
    size is the fifth field, date's the sixth, +%s is seconds since 1 Jan 1970 00:00 UTC, that /86400 is days since, so the first sort is by day and descending size and the second is "stable" -- give up speed to keep things in input order when you can, "unique" -- select only the first record for each key. First one it sees in input order for each day will be the largest. Commented Jun 4, 2015 at 9:22
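To see the day-bucketing trick from that answer in isolation, here is a hedged sketch that feeds hand-made (epoch-seconds, size, name) records through the same two-pass sort; the field numbers are shifted to 1 and 2, and the input values are invented for illustration:

```shell
# Made-up (epoch, size, name) records standing in for the ls output.
printf '%s\n' \
  '1430229300 223 a' \
  '1430231100 253 b' \
  '1430313480 122 c' \
  '1430315280 127 d' \
| awk '{$1 = int($1/86400); print}'   `# epoch seconds -> day number` \
| sort -k1,1n -k2,2nr                 `# by day, then size descending` \
| sort -s -u -k1,1n                   `# keep first (largest) per day`
# prints:
# 16553 253 b
# 16554 127 d
```

The stable, unique second pass is what makes "first seen per day" equal "largest per day", since the first pass already put the largest record of each day first.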

An approach based on stat to obtain the file information and awk to determine the maximum for each date:

stat -c $'%.10y\t%s\t%n' * | awk 'BEGIN { FS=OFS="\t" } s[$1]<$2 { s[$1]=$2 ; n[$1]=$3 } END { for (d in n) print d,s[d],n[d] | "sort" }' 

The output will be a tab-separated list of (date, size, filename) tuples.
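To watch the awk half of this in isolation, here is a sketch that replays the same max-per-key selection on inline sample records (made-up dates, sizes, and names) instead of real stat output:

```shell
# Made-up tab-separated (date, size, name) records replacing stat output.
printf '2015-04-28\t223\ta\n2015-04-28\t253\tb\n2015-04-29\t122\tc\n' \
| awk 'BEGIN { FS=OFS="\t" }
       s[$1]<$2 { s[$1]=$2; n[$1]=$3 }            # remember the max per date
       END { for (d in n) print d, s[d], n[d] | "sort" }'
# prints one line per date: its largest size and the matching name
```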


I would script it around 'ls' to get the files in size order, and then limit the result to get the largest.

For example, 'ls -lS 2015.04.29*' lists the files in descending size order, and 'ls -lS 2015.04.29* | head -1' should give you the largest.

From there you can strip out just the file name, depending on your need. Essentially it is a for loop over the dates found in the filenames, with an ls command per date to get that date's largest file.
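The loop described above could be sketched roughly as follows. Note this keys on the date in the filename, not the modification time the asker ultimately wanted; the truncate lines just fabricate sample files matching the question so the sketch runs standalone:

```shell
# Create sample files matching the question's naming scheme (sizes invented).
cd "$(mktemp -d)"
truncate -s 223K 2015.04.28_14.25
truncate -s 276K 2015.04.28_15.25
truncate -s 122K 2015.04.29_09.08
truncate -s 137K 2015.04.29_12.13

# Derive the date prefixes from the filenames, then let ls -S
# (sort by size, largest first) pick one file per date.
for date in $(ls | cut -d_ -f1 | sort -u); do
    ls -S "${date}"_* | head -n 1
done
# prints:
# 2015.04.28_15.25
# 2015.04.29_12.13
```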
