2

I have a large directory of files last modified at various times over the past several years. Is there an easy command, or set of commands, that I can type once in an interactive bash shell to create subdirectories named with a four-digit year and move each file into the subdirectory for its modification year? I cannot rely on anything in the file names to indicate a file's age. Finally, I'd like to verify that everything I ran worked properly and that I didn't lose any files or data.

For example, given this completely fake listing:

    $ cd ~/Documents
    $ ls -lhrt
    ...
    -rw-r--r-- 4 user user 4.0K Jun 29  2017 oldfile.txt
    -rw-r--r-- 4 user user 4.0K May 15  2018 2018file.md
    -rw-r--r-- 4 user user 4.0K Apr 14  2019 04.dat
    -rw-r--r-- 4 user user 4.0K Jul 21  2019 somepage.html
    drw-r--r-- 4 user user 4.0K Jul 21  2019 somepage_files
    -rw-r--r-- 4 user user 4.0K Mar 13  2020 march.dat
    -rw-r--r-- 4 user user 4.0K Feb 12 18:03 file02.dat
    -rw-r--r-- 4 user user 4.0K Oct 11 18:03 OctReport.txt

When I'm done, I want to end up with the following:

    $ cd ~/Documents
    $ find .
    .
    ./2017
    ./2017/oldfile.txt
    ./2018
    ./2018/2018file.md
    ./2019
    ./2019/04.dat
    ./2019/somepage.html
    ./2019/somepage_files
    ./2019/somepage_files/...
    ./2020
    ./2020/march.dat
    ./2021
    ./2021/file02.dat
    ./2021/OctReport.txt

3 Answers

1

You can use a for loop with date to get the modification year of each file (assuming GNU date or a compatible implementation, for its -r option) and create the directories from it. Each file is then moved into its respective directory; the [ ! -L "$file" ] test skips symbolic links.

    for file in *; do
        [ ! -L "$file" ] &&
            dir_name=$(date -r "$file" +%Y) &&
            mkdir -p "$dir_name" &&
            mv -- "$file" "$dir_name"
    done
  • Since multiple files will map to the same directory and you are doing mkdir for each file, you are likely to end up with lots of "cannot create directory: File exists" error messages from mkdir. You should probably either check if the directory exists already or use mkdir -p to avoid the error message. Commented Dec 22, 2021 at 4:36
  • Thanks, that is cleaner than redirecting those errors to /dev/null. Commented Dec 22, 2021 at 4:42
  • I stated I wanted directories included as well, so I ran without the [ -f "$file" ] && but then discovered I had a couple symlinks I wanted excluded, so I replaced with [ \! -L "$file" ] && and it appears to have worked as expected. Commented Dec 22, 2021 at 7:54
  • @jia103 My initial answer did in fact include directories before it was edited by Stéphane but it did not exclude symlinks as that was not noted in the question. Regardless, nice to hear you got it working as expected. Commented Dec 22, 2021 at 17:26
  • @jia103 Sorry about that, I added the [ -f "$file" ] to avoid moving the year directories themselves. Note that the modification time of a directory doesn't reflect the age of the content of the files in them, only the last time a file was added / removed / renamed in them. You could do [[ $file != [12][0-9][0-9][0-9] ]] to exclude the year directories themselves. Commented Dec 22, 2021 at 17:52
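
Pulling the suggestions from these comments together, here is a minimal sketch of the same loop that skips symlinks and leaves already-created year directories alone. The function name sort_by_year is hypothetical, and the sketch still assumes GNU date (or a compatible one) for -r. It is one way to combine the ideas above, not the answer's original code.

```shell
# sort_by_year DIR: move each entry of DIR into a subdirectory named after
# the entry's modification year. Hypothetical helper; assumes GNU date (-r).
# The body runs in a subshell, so the caller's working directory is untouched.
sort_by_year() (
    cd -- "${1:-.}" || exit 1
    for file in *; do
        [ -e "$file" ] || continue      # unmatched glob or dangling symlink
        [ -L "$file" ] && continue      # leave symlinks where they are
        case $file in
            # skip directories that already look like year directories
            [12][0-9][0-9][0-9]) [ -d "$file" ] && continue ;;
        esac
        year=$(date -r "$file" +%Y) || continue
        mkdir -p -- "$year" && mv -- "$file" "$year"/
    done
)
```

Called as sort_by_year ~/Documents, it should be safe to run more than once, since the year directories are skipped on later runs. Note that * does not match hidden files, so dotfiles stay where they are.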
1

Using the perl rename utility (not to be confused with rename from util-linux or any other rename):

    $ rename -n -e 'BEGIN {use POSIX}; next unless -f $_; my $Y = strftime "%Y", localtime((stat $_)[9]); mkdir $Y unless -d $Y; s=^=$Y/=' *
    rename(04.dat, 2019/04.dat)
    rename(2018file.md, 2018/2018file.md)
    rename(file02.dat, 2021/file02.dat)
    rename(march.dat, 2020/march.dat)
    rename(OctReport.txt, 2021/OctReport.txt)
    rename(oldfile.txt, 2017/oldfile.txt)
    rename(somepage_files, 2019/somepage_files)
    rename(somepage.html, 2019/somepage.html)

The -n option makes this a dry run. Remove it (or replace it with -v for verbose output) once you're sure it's going to do what you want. For example:

    $ find . -type f -ls
    864326 9 drwxr-xr-x 2 cas cas 10 Dec 22 18:08 .
    863863 1 -rw-r--r-- 1 cas cas 0 Oct 11 18:03 ./OctReport.txt
    863999 1 -rw-r--r-- 1 cas cas 0 Jul 21 2019 ./somepage_files
    864123 1 -rw-r--r-- 1 cas cas 0 Mar 13 2020 ./march.dat
    863997 1 -rw-r--r-- 1 cas cas 0 May 15 2018 ./2018file.md
    864122 1 -rw-r--r-- 1 cas cas 0 Apr 14 2019 ./04.dat
    863862 1 -rw-r--r-- 1 cas cas 0 Feb 12 2021 ./file02.dat
    863998 1 -rw-r--r-- 1 cas cas 0 Jul 21 2019 ./somepage.html
    863996 1 -rw-r--r-- 1 cas cas 0 Jun 29 2017 ./oldfile.txt

    $ rename -v -e 'BEGIN {use POSIX}; next unless -f $_; my $Y = strftime "%Y", localtime((stat $_)[9]); mkdir $Y unless -d $Y; s=^=$Y/=' *
    04.dat renamed as 2019/04.dat
    2018file.md renamed as 2018/2018file.md
    file02.dat renamed as 2021/file02.dat
    march.dat renamed as 2020/march.dat
    OctReport.txt renamed as 2021/OctReport.txt
    oldfile.txt renamed as 2017/oldfile.txt
    somepage_files renamed as 2019/somepage_files
    somepage.html renamed as 2019/somepage.html

    $ find . -type f -ls
    863996 1 -rw-r--r-- 1 cas cas 0 Jun 29 2017 ./2017/oldfile.txt
    863862 1 -rw-r--r-- 1 cas cas 0 Feb 12 2021 ./2021/file02.dat
    863863 1 -rw-r--r-- 1 cas cas 0 Oct 11 18:03 ./2021/OctReport.txt
    863999 1 -rw-r--r-- 1 cas cas 0 Jul 21 2019 ./2019/somepage_files
    864122 1 -rw-r--r-- 1 cas cas 0 Apr 14 2019 ./2019/04.dat
    863998 1 -rw-r--r-- 1 cas cas 0 Jul 21 2019 ./2019/somepage.html
    864123 1 -rw-r--r-- 1 cas cas 0 Mar 13 2020 ./2020/march.dat
    863997 1 -rw-r--r-- 1 cas cas 0 May 15 2018 ./2018/2018file.md

See perldoc -f stat and perldoc -f localtime for details on those perl functions. Details on strftime() can be found in perldoc POSIX.

BTW, rename can take filenames as args on the command line, or from stdin (even NUL-separated filenames if you use rename's -0 option, which is useful when dealing with filenames containing newlines and other annoying characters). See man rename.

1

find may be of great use here. Note that -printf requires GNU find:

  1. Create necessary directories

    find -maxdepth 1 -mindepth 1 -type f -printf '%TY\n' | sort -u | xargs mkdir

  2. Move files accordingly:

    find -maxdepth 1 -mindepth 1 -type f -printf '%f\0%TY/%f\0' | xargs -0 -l2 mv --

Using NUL-delimited names in step 2 makes this approach very robust towards filenames with spaces, newlines, etc.

When it comes to including directories (your text says files, but your example includes directories), one would have to exclude the year-named directories in step 2:

    find -maxdepth 1 -mindepth 1 ! -name '[12][0129][0-9][0-9]' -printf '%f\0%TY/%f\0' | xargs -0 -l2 mv --

... assuming that all of your modification years match that pattern (it covers 1900 through 2299, among others), hopefully.
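
Before running the real move, it may help to preview the source/destination pairs that the find pipeline above generates. The following is a minimal sketch, assuming GNU find (for -printf) and an xargs that supports -0 and -r; the function name preview_year_moves is hypothetical. It uses -n 2 to hand xargs exactly two arguments per invocation and prints them instead of calling mv.

```shell
# preview_year_moves DIR: print "name -> YEAR/name" for every entry that the
# move step would relocate, without touching anything. Hypothetical helper;
# assumes GNU find (-printf) and xargs with -0 and -r.
preview_year_moves() {
    find "${1:-.}" -maxdepth 1 -mindepth 1 ! -name '[12][0129][0-9][0-9]' \
        -printf '%f\0%TY/%f\0' |
        xargs -0 -r -n 2 sh -c 'printf "%s -> %s\n" "$1" "$2"' sh
}
```

If the preview looks right, the same find expression can be fed to xargs -0 -n 2 mv -- from inside the directory.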


Maybe you also want to update the age of the directories? Then run:

    for dir in [12][0129][0-9][0-9]; do
        touch -d "${dir}-01-01" "$dir"
    done

That way, each directory's modification time is set to January 1 of its year, which simplifies future date-based searches.
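
The question also asks how to confirm that nothing was lost. One possible approach, sketched here under the assumption that sha256sum from GNU coreutils is available, is to record a sorted list of content checksums before the move and compare it with the same list afterwards. Because the list holds only hashes and no paths, moving files around should leave it unchanged, while a lost or altered file would show up as a difference. The function name content_manifest and the /tmp file names are hypothetical.

```shell
# content_manifest DIR: print one SHA-256 hash per regular file under DIR,
# sorted, with paths stripped so that moves do not change the result.
# Hypothetical helper; assumes sha256sum (GNU coreutils).
content_manifest() {
    find "${1:-.}" -type f -exec sha256sum {} + | cut -d ' ' -f 1 | sort
}

# Usage sketch (keep the manifests outside the directory being reorganised):
#   content_manifest ~/Documents > /tmp/before.sha256
#   ... run one of the move commands ...
#   content_manifest ~/Documents > /tmp/after.sha256
#   cmp /tmp/before.sha256 /tmp/after.sha256 && echo "all file contents accounted for"
```

This only covers the contents of regular files. Symlinks, empty directories and timestamps are not checked, so a quick look at find . output afterwards is still worthwhile.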

  • -printf(%TY\n) What shell syntax is that? Also [1,2] matches on 1, , or 2. -printf is GNU-specific. Commented Dec 22, 2021 at 17:54
  • @StéphaneChazelas [1,2] - mixup with shell brace expansion thanks & corrected. -printf being GNU -> I was not aware, good to mention. Regarding the syntax: I cannot follow your question? %Tx is last modification time in format x, Y being 4-digit year. \n just to separate the entries for xargs - well knowing that in YYYY there are no dangerous characters (if executed in the current dir). Commented Dec 22, 2021 at 18:42
  • If that's meant to be bash syntax, it should be -printf '%TY\n', not -printf(%TY\n). !-name should be ! -name. mv should be mv -- Commented Dec 22, 2021 at 19:11
