
I have about 4k pictures with this format:

    photo_6923@06-01-2022_14-18-36.jpg
    photo_6924@07-01-2022_00-03-23.jpg
    photo_6925@07-01-2022_01-36-20.jpg
    photo_6926@07-01-2022_10-44-20.jpg
    photo_6927@07-01-2022_10-44-20.jpg

As you can guess, the first string after the @ and up to the underscore corresponds to the date, and from the underscore to the file extension, to the time.

Unfortunately the seconds do not show up, so that pictures taken at a given minute will have the same ending; e.g.

    photo_6925@07-01-2022_01-36-20.jpg
    photo_6926@07-01-2022_10-44-20.jpg
    photo_6927@07-01-2022_10-44-20.jpg

I would like to rename them like this

  • (1) Remove the photo_*.*@

  • (2) Replace "-" with "_" in the date

  • (3) Write the time as _01h36m20s instead of _01-36-20

  • (4) If multiple instances of a file exist, do not delete them but add a suffix like

    photo_6926@07-01-2022_10-44-20.jpg ---> 07_01_2022_10h44m20s_00001.jpg

    photo_6927@07-01-2022_10-44-20.jpg ---> 07_01_2022_10h44m20s_00002.jpg

  • (5) Last problem: I have different folders with different pictures and unfortunately, after points 1-4 have been addressed, it is likely that there are two pictures in different folders with the same name, so that moving them to a new folder would overwrite some of them. A potential solution would be to keep increasing the counter of the suffix (this is why I have added five zeroes there). This means that

If name A exists in both folders 1 and 2, then when moving all of the pictures from 1 and 2 into a new folder 3, check whether multiple instances exist. If this is the case, increase the counter by one. For example, imagine the following situation

    $ ls 1/ 2/ 3/
    1/
    07_01_2022_01h36m20s_00001.jpg
    07_01_2022_01h36m20s_00002.jpg
    2/
    07_01_2022_01h36m20s_00002.jpg
    3/

When moving them,

    $ mv 1/* 2/* 3/, ls 3/
    07_01_2022_01h36m20s_00001.jpg
    07_01_2022_01h36m20s_00002.jpg
    07_01_2022_01h36m20s_00003.jpg <-------- counter added one here

Thanks! PS: One could do this with metadata, but unfortunately all metadata has been wiped out from the pictures.

  • What operating system are you using? Commented Apr 1, 2024 at 18:55
  • It may not matter to you, but it's not obvious here whether the date on the first example file 06-01-2022 is 6th January or 1st June. Commented Apr 1, 2024 at 19:57
  • Facepalm. Why would you use 07_01_2022 rather than 2022_07_01? Commented Apr 2, 2024 at 2:05
  • Because I am not Chinese :) and I am following the format that we have here, which is not the anglosaxon one: day_month_year instead of month_day_year. My operating system is OpenBSD. Commented Apr 2, 2024 at 12:56
  • YYYY-mm-dd is an international standard (ISO8601) and unambiguous and helps with sorting. The middle-Endian MDY is not anglosaxon, it's more American; Britain (including England) and Saxony use mainly DMY. See en.wikipedia.org/wiki/List_of_date_formats_by_country Commented Apr 5, 2024 at 9:56

3 Answers


With zsh:

    autoload zmv
    typeset -A n=()
    #       1              23     4     5     6     7     8     9
    zmv -n '(**/)photo_<->@((<->)-(<->)-(<->)_(<->)-(<->)-(<->))(.jpg)(#qn.)' \
           '$1${3}_${4}_${5}_${6}h${7}m${8}s_${(l[5][0])$((++n[\$2]))}$9'

Remove -n (dry-run) if happy. On your sample, that gives:

    mv -- photo_6923@06-01-2022_14-18-36.jpg 06_01_2022_14h18m36s_00001.jpg
    mv -- photo_6924@07-01-2022_00-03-23.jpg 07_01_2022_00h03m23s_00001.jpg
    mv -- photo_6925@07-01-2022_01-36-20.jpg 07_01_2022_01h36m20s_00001.jpg
    mv -- photo_6926@07-01-2022_10-44-20.jpg 07_01_2022_10h44m20s_00001.jpg
    mv -- photo_6927@07-01-2022_10-44-20.jpg 07_01_2022_10h44m20s_00002.jpg

Explained:

  • typeset -A n=() creates an initially empty associative array.
  • zmv is an autoloadable function that takes a glob pattern (an extended glob one, zmv does set the extendedglob option for the duration of its execution) and expression as two separate arguments, and renames the files that match the pattern to the expansion of the expression inside which $1, $2... correspond to what is matched by the respective nth pair of (...) in the pattern.
  • For the pattern:
    • **/ matches any level of subdirectories (including 0), and that's inside the first (...), so will be available as $1 in the replacement. Note that zmv processes the files depth-first (leaves before the branches they're on) as if given the (#qod) glob qualifier which is generally important when renaming files, though not here as we're only renaming regular files, not directories.
    • <-> matches any non-negative decimal integer number. That's like <1-31> but with boundaries not specified, so matches <0-infinity> or IOW any sequence of decimal digits. You could change it to (<1-31>)-(<1-12>)-(<1900-2100>)... if you wanted the matching to be stricter.
    • (#q...) adds glob qualifiers, n for the sorting to be numerical instead of the default of lexical (so that photo_10 comes after photo_2 for instance instead of before), and . to restrict the match to regular files (excluding all other types such as directories, symlinks, fifos...).
  • For the replacement:
    • ${(l[5][0])expansion} left pads the expansion to a length of 5 characters with 0s.
    • $(( ++n[\$2] )) expands to the value of the associative array element of key $2 (which is matched by the second (...), so ((<->)-(<->)-(<->)_(<->)-(<->)-(<->))) incremented by one. Note the \ to defer the dereferencing of that parameter, so it's not expanded inside the arithmetic expression. It would be a problem if for instance, $2 contained ] (which is not the case here).
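
For readers without zsh, the same pattern-plus-counter idea can be sketched in Python. This is only an illustration under a few assumptions: the function name plan_renames is made up here, it works on base names only (no directory part), and it returns a plan rather than touching any file, much like zmv -n.

```python
import re
from collections import defaultdict

# Same shape as the zmv pattern: photo number, then the date and time fields.
PATTERN = re.compile(
    r"photo_(\d+)@((\d+)-(\d+)-(\d+)_(\d+)-(\d+)-(\d+))(\.jpg)"
)


def plan_renames(names):
    """Return (old, new) pairs; one counter is kept per date/time stamp."""
    counters = defaultdict(int)

    def photo_number(name):
        m = PATTERN.fullmatch(name)
        return int(m.group(1)) if m else -1

    plan = []
    # Numeric sort on the photo number, in the spirit of the n glob qualifier
    # (photo_10 is handled after photo_2).
    for name in sorted(names, key=photo_number):
        m = PATTERN.fullmatch(name)
        if m is None:
            continue  # names that do not match are left alone
        stamp = m.group(2)
        counters[stamp] += 1  # plays the role of ++n[$2]
        d, mo, y, h, mi, s = m.group(3, 4, 5, 6, 7, 8)
        new = f"{d}_{mo}_{y}_{h}h{mi}m{s}s_{counters[stamp]:05d}{m.group(9)}"
        plan.append((name, new))
    return plan


if __name__ == "__main__":
    sample = [
        "photo_6926@07-01-2022_10-44-20.jpg",
        "photo_6927@07-01-2022_10-44-20.jpg",
    ]
    for old, new in plan_renames(sample):
        print(f"mv -- {old} {new}")
```

Printing the plan first and only then calling os.rename on each pair keeps the dry-run habit that -n gives you with zmv.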

To set the EXIF CreateDate based on the date in the filename:

    exiftool -r -ext jpg -d '%d-%m-%Y_%H-%M-%S' \
      -if '$Filename =~ /@\d+-\d+-\d+_\d+-\d+-\d+\.jpg\z/i' \
      -'CreateDate<${FileName;s/.*@//;s/\.jpg\z//i}' .

(shell independent)

Explained:

  • -r: finds files recursively inside those files passed as arguments that are directories (here . the current working directory).
  • -ext jpg: only considers the files with a jpg extension (case insensitively).
  • -if 'perl expression': further restricts the filter to those for which the perl expression returns true.
  • $Filename =~ /@\d+-\d+-\d+_\d+-\d+-\d+\.jpg\z/i: whether the file name (without directory component) matches the given perl regular expression, so here ends in @<digits>-<digits>-<digits>_<digits>-<digits>-<digits>.jpg.
  • -d '%d-%m-%Y_%H-%M-%S' use that strftime/strptime template for the formatting/parsing of dates.
  • -'CreateDate<date': sets the CreateDate EXIF metadata attribute to the specified date, here:
  • ${FileName;s/.*@//;s/\.jpg\z//i} the file name from which we have removed everything up to the right-most @ and the .jpg extension.
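
To check how that template reads a given name before letting exiftool write anything, the parsing step can be reproduced in Python. This is a rough sketch only: date_from_name and exif_stamp are names invented for the illustration, and nothing here calls exiftool or modifies a file.

```python
from datetime import datetime


def date_from_name(filename):
    """Parse the part after the right-most @, minus the .jpg extension."""
    stamp = filename.rsplit("@", 1)[-1]
    if stamp.lower().endswith(".jpg"):
        stamp = stamp[:-4]
    # Same template as the -d option above: day-month-year_hour-minute-second.
    return datetime.strptime(stamp, "%d-%m-%Y_%H-%M-%S")


def exif_stamp(filename):
    """Format the parsed date the way EXIF stores dates (YYYY:MM:DD HH:MM:SS)."""
    return date_from_name(filename).strftime("%Y:%m:%d %H:%M:%S")


if __name__ == "__main__":
    print(exif_stamp("photo_6923@06-01-2022_14-18-36.jpg"))
```

With the day-first template, 06-01-2022 is read as 6 January 2022; swap %d and %m if your files are month-first.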
  • Thanks a lot, Stéphane! Could you explain a little bit the command, in particular the (#qn.) part and the counter? I am also not sure about the $9 bit. How to address point 5? Any idea? Thanks a lot in any case. Commented Apr 2, 2024 at 11:10
  • @NixChaotian, point 5 is addressed by renaming all files in all folders at once. Commented Apr 2, 2024 at 11:14
  • You are right; my exposition was wrong - Initially, two or more different folders might have identical filenames corresponding to different pictures. This is because I am downloading them from different Telegram chats and, if unlucky, I might run into that problem. In any case, I think I can address it in a different way. There must be a way to check whether two files are identical in spite of having the same name. Commented Apr 2, 2024 at 12:59
  • @NixChaotian see edit for the explanation. Commented Apr 5, 2024 at 9:39
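
On the question raised in the comments of telling whether two same-named files are really the same picture, one common approach is to compare content hashes. A minimal sketch, assuming SHA-256 is acceptable and using made-up helper names file_digest and same_content:

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(path_a, path_b):
    """True if both files hold byte-for-byte identical content."""
    return file_digest(path_a) == file_digest(path_b)
```

Two files with equal digests can be treated as duplicates; files with the same name but different digests are different pictures and need distinct names.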

I've explored a solution to this using TXR Lisp. It's a bit verbose in its present state. It parses the names into structures.

For the requirement that there may be multiple directories which will result in clashing names, we should just deal with all the directories simultaneously. I prepared this example set of paths:

    path/a/photo_6923@06-01-2022_14-18-36.jpg
    path/a/photo_6924@07-01-2022_00-03-23.jpg
    path/a/photo_6925@07-01-2022_01-36-20.jpg
    path/a/photo_6926@07-01-2022_10-44-20.jpg
    path/a/photo_6927@07-01-2022_10-44-20.jpg
    path/to/b/photo_6923@06-01-2023_14-18-36.jpg
    path/to/b/photo_6924@07-01-2023_00-03-23.jpg
    path/to/b/photo_6925@07-01-2023_01-36-20.jpg
    path/to/b/photo_6926@07-01-2023_10-44-20.jpg
    path/to/b/photo_6927@07-01-2022_10-44-20.jpg

We have names in a path/a directory and a path/to/b directory. There are clashing entries that have the same time/date 07-01-2022_10-44-20.

In the real program we would use some glob expression to get the names like:

(glob "{path/a,path/to/b}/photo_*.jpg") 

rather than reading from a file, and we could replace the dummy-rename function with rename-path.

Run:

    $ txr rename.tl
    path/to/b/photo_6926@07-01-2023_10-44-20.jpg -> path/to/b/07_01_2023_10h44m20s
    path/a/photo_6926@07-01-2022_10-44-20.jpg -> path/a/07_01_2022_10h44m20s_00000
    path/a/photo_6927@07-01-2022_10-44-20.jpg -> path/a/07_01_2022_10h44m20s_00001
    path/to/b/photo_6927@07-01-2022_10-44-20.jpg -> path/to/b/07_01_2022_10h44m20s_00002
    path/to/b/photo_6924@07-01-2023_00-03-23.jpg -> path/to/b/07_01_2023_00h03m23s
    path/a/photo_6924@07-01-2022_00-03-23.jpg -> path/a/07_01_2022_00h03m23s
    path/to/b/photo_6925@07-01-2023_01-36-20.jpg -> path/to/b/07_01_2023_01h36m20s
    path/a/photo_6925@07-01-2022_01-36-20.jpg -> path/a/07_01_2022_01h36m20s
    path/to/b/photo_6923@06-01-2023_14-18-36.jpg -> path/to/b/06_01_2023_14h18m36s
    path/a/photo_6923@06-01-2022_14-18-36.jpg -> path/a/06_01_2022_14h18m36s

Code in rename.tl:

    (defstruct name ()
      orig dir number time
      (:method fmt (me)
        (let ((tm me.time))
          `@{tm.month}_@{tm.day}_@{tm.year}_@{tm.hour}h@{tm.min}m@{tm.sec}s`)))

    (defun parse (str)
      (let ((dir (dir-name str))
            (base (base-name str)))
        (match `photo_@num\@@mm-@dd-@{yyyy}_@HH-@MM-@SS.jpg` base
          (new name
               orig str dir dir number num
               time (new time year yyyy month mm day dd
                              hour HH min MM sec SS)))))

    (defun dummy-rename (from dir to)
      (put-line `@from -> @(path-cat dir to)`))

    (flow (file-get-lines "data")
      (mapcar parse)
      (group-by .time)
      (dohash (date part @1)
        (if (eql 1 (len part))
          (let ((n (first part)))
            (dummy-rename n.orig n.dir n.(fmt)))
          (each ((n part)
                 (i "00000".."99999"))
            (dummy-rename n.orig n.dir `@{n.(fmt)}_@i`)))))

Only the files that clash to the same date/time get the incrementing stamp. Since this is across multiple directories, we don't need five digits.

The core of the algorithm is to take the parsed objects and group them into groups based on the date. For those groups which only contain one element, we format the name without the additional counters. Within the groups of two or more, we iterate them, while iterating over 00000 to 99999 in parallel, and tack that on as a suffix.
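
For comparison, the same grouping idea can be sketched in Python. This is an illustration rather than a translation of the TXR program: plan_grouped is a hypothetical name, the counter starts at 00000 as above, and, as an assumption about the intended result, the sketch appends the .jpg extension to the new names.

```python
import posixpath
import re
from collections import defaultdict

# Captures the six date/time fields of a base name such as
# photo_6926@07-01-2022_10-44-20.jpg
STAMP = re.compile(r"photo_\d+@(\d+)-(\d+)-(\d+)_(\d+)-(\d+)-(\d+)\.jpg")


def plan_grouped(paths):
    """Group paths by date/time; only groups of two or more get a counter."""
    groups = defaultdict(list)
    for path in paths:
        m = STAMP.fullmatch(posixpath.basename(path))
        if m:
            groups[m.groups()].append(path)

    plan = []
    for (d, mo, y, h, mi, s), members in groups.items():
        base = f"{d}_{mo}_{y}_{h}h{mi}m{s}s"
        for i, path in enumerate(members):
            # A lone file keeps the bare name; clashing files get _00000, _00001...
            suffix = f"_{i:05d}" if len(members) > 1 else ""
            new = posixpath.join(posixpath.dirname(path), base + suffix + ".jpg")
            plan.append((path, new))
    return plan
```

Because the groups span every directory passed in, the resulting base names are unique across all of them, so the files can later be moved into one folder without overwriting each other.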

The renames aren't sorted, since group-by produces a hash table. We can see the organization better with a sort:

    $ txr rename.tl | sort
    path/a/photo_6923@06-01-2022_14-18-36.jpg -> path/a/06_01_2022_14h18m36s
    path/a/photo_6924@07-01-2022_00-03-23.jpg -> path/a/07_01_2022_00h03m23s
    path/a/photo_6925@07-01-2022_01-36-20.jpg -> path/a/07_01_2022_01h36m20s
    path/a/photo_6926@07-01-2022_10-44-20.jpg -> path/a/07_01_2022_10h44m20s_00000
    path/a/photo_6927@07-01-2022_10-44-20.jpg -> path/a/07_01_2022_10h44m20s_00001
    path/to/b/photo_6923@06-01-2023_14-18-36.jpg -> path/to/b/06_01_2023_14h18m36s
    path/to/b/photo_6924@07-01-2023_00-03-23.jpg -> path/to/b/07_01_2023_00h03m23s
    path/to/b/photo_6925@07-01-2023_01-36-20.jpg -> path/to/b/07_01_2023_01h36m20s
    path/to/b/photo_6926@07-01-2023_10-44-20.jpg -> path/to/b/07_01_2023_10h44m20s
    path/to/b/photo_6927@07-01-2022_10-44-20.jpg -> path/to/b/07_01_2022_10h44m20s_00002

We can clearly see that in the second directory, the clash is avoided by the name 07_01_2022_10h44m20s_00002.


With GNU Parallel:

parallel mv -- {} '{=s/(.*)photo_\d+\@((..)-(..)-(....)_(..)-(..)-(..))(.jpg)/$_="$1$3_$4_$5_$6h$7m$8s_".sprintf("%05d",++$n{$2}).$9/e =}' ::: dira/* dirb/* 

All filenames will now be unique and can be moved to the same folder.
