1

This question arose from another question I asked here ("How to extract basename of parent directory in shell"), which seems to have sent me down the rabbit hole of Unix string manipulation. So here is a supplementary question:

What is the correct way to extract various parts ("levels") from dirname results combined with find?

Let's assume I have the following hierarchy:

DE_AT/adventure/motovun/300x250/A2_300x250.zip 

I "find" the file like so:

find . -name "*.zip" 

and run a shell on the find results:

-exec sh -c '' {} \; 

How would I extract each part of the full path? How do I get:

  • DE_AT
  • adventure
  • motovun
  • 300x250
  • A2_300x250.zip

This is what I know so far:

basename "$1" # gets me: A2_300x250.zip
dirname "$1" # gets me: ./DE_AT/adventure/motovun/300x250

I am asking because I need to rename these .zip files to someString_DE_AT_motovun+A2_300x250.zip.

I came up with a horrible frankensolution like so:

find . -name "*.zip" -exec sh -c '
mv "$0" "myString_$(basename $(dirname $(dirname \
$(dirname "$0")_...+$(basename "$0")"
' {} \;

I don't even wish to try this because this simply cannot be correct.

2 Answers

4

You can use the split+glob operator:

find . -name '*.zip' -exec sh -c '
  IFS=/ # split on /
  set -f # disable glob
  for file do
    set -- $file # invoke split+glob, store in positional parameters
    # now the path components are in $1, $2...
    mv -i -- "$file" "someString_${2}_${4}+${6}"
  done' sh {} +

$1 would contain ., $2 would contain DE_AT, and so on. Getting the last argument is trickier, as you need something like:

eval "last=\${$#}" 
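To make the steps above concrete, here is a minimal sketch that runs the same split outside of find, on the sample path from the question. The variable names (path, count, first, last) are only illustrative, and the script assumes IFS and globbing were at their defaults beforehand.

```shell
#!/bin/sh
# Sample path from the question; in the find loop this would be "$file".
path=./DE_AT/adventure/motovun/300x250/A2_300x250.zip

IFS=/          # split on /
set -f         # disable glob
set -- $path   # unquoted on purpose: invokes split+glob

count=$#             # number of path components, including the leading "."
first=$2             # $1 is ".", so the top-level directory is $2
eval "last=\${$#}"   # expands to last=${6} here, i.e. the file name

unset IFS      # back to default splitting (assumes IFS was not customized)
set +f         # re-enable globbing

printf '%s\n' "count=$count first=$first last=$last"
```

For this path the script prints count=6 first=DE_AT last=A2_300x250.zip.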

It may be easier to use a different shell like zsh which has proper split operators and arrays for that:

find . -name '*.zip' -exec zsh -c '
  for file do
    components=(${(s:/:)file})
    printf "Last component: %s\n" $components[-1]
    mv -i -- "$file" "someString_$components[2]_$components[-3]+$components[-1]"
  done' zsh {} +

With zsh, you can also use its zmv batch-renaming tool:

autoload zmv # best in ~/.zshrc
zmv -n '([^/]#)/**/(*)/*/(*.zip)' 'someString_${1}_${2}+$3'

The **/ part matches any level (including 0) of subdirectories, so it will match on (a)/b/c/(d)/e/(f.zip) or (a)/(b)/c/(d.zip) with the captured strings (a/d/f.zip, a/b/d.zip) going in $1/$2/$3 for the replacement so as to get a similar behaviour as for the $components array approach above.

The [^/]# part, where # is like the regexp * operator, matches any sequence of non-/ characters. For globs, * works the same, as * cannot go across a /. But after expanding the glob, zmv uses pattern matching on the resulting files to extract the parts for the replacement, and there * would go across a /, so (*) in place of ([^/]#) would match too much.
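Note that the sh and zsh loops above give mv a bare file name as the target, so the renamed file ends up in the current directory rather than next to the original. A sketch of one way to keep it in its own directory, assuming POSIX sh; rename_in_place is a hypothetical helper name, not something from the commands above:

```shell
#!/bin/sh
# rename_in_place: hypothetical helper, shown for illustration only.
# $1: a path as printed by "find .", e.g.
#     ./DE_AT/adventure/motovun/300x250/A2_300x250.zip
rename_in_place() {
  file=$1
  dir=${file%/*}     # strip the last /component: the containing directory
  (
    # Subshell: the IFS and set -f changes do not leak to the caller.
    IFS=/            # split on /
    set -f           # disable glob
    set -- $file     # $1 is ".", $2 DE_AT, $4 motovun, $6 the file name
    mv -i -- "$file" "$dir/someString_${2}_${4}+${6}"
  )
}
```

The same two lines (computing dir and prefixing the mv target with "$dir/") could be placed inside the for loop of the sh -c script instead of using a function.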

6
  • Stephane, can you please elaborate a bit on the zsh solution? I am trying to get man zmv but I have no entries :(. What is the -n switch, and why $1, $2, $3 when in the first one you are using 2, 4 and 6? Sorry for being a total noob here :/ Commented Aug 31, 2016 at 11:51
  • @AlexStarbuck see edit. I already gave the link to the zmv doc in my answer to your other question. For zsh, as for any biggish manual like bash's, I'd use info instead of man. If you do info zsh, type i to get the index, and enter zmv (completion available), it should take you to the zmv documentation. Commented Aug 31, 2016 at 12:09
  • If I press i while in info zsh I get the message: "No indices available". Commented Aug 31, 2016 at 12:20
  • @AlexStarbuck, you probably don't have the info pages for zsh installed on your system, so info zsh only gives you a dump of the man page. If on a Debian-like system, you may need to install the zsh-doc package. Commented Aug 31, 2016 at 12:22
  • Stephane, your zsh solution works brilliantly :) but also moves the renamed .zip files to top of the hierarchy (to a containing folder banners/, which holds /DE_AT/, /DE_DE/ and /DE_CH/ folders; how can I not move them?) Commented Aug 31, 2016 at 12:30
0

Is using only find's -exec a strict requirement? I'd rather loop over the find results and combine that with a string-manipulation-friendly tool like awk:

for ii in $(find . -name "*.zip")
do
  mv $ii $(echo $ii|awk -F/ '{print "someString_" $2 "_" $4 "+" $6}')
done

(Replace mv by echo mv for testing purposes.)

NB: awk's -F/ option sets / as the field separator instead of the default blanks (spaces and tabs).
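To check the field numbering without touching any files, the awk part can be run on its own. A minimal sketch on the sample path from the question (the variable names path and new_name are only illustrative):

```shell
#!/bin/sh
# Sample path as "find ." would print it.
path=./DE_AT/adventure/motovun/300x250/A2_300x250.zip

# With -F/ the fields are: $1="." $2="DE_AT" $3="adventure"
# $4="motovun" $5="300x250" $6="A2_300x250.zip"
new_name=$(printf '%s\n' "$path" |
  awk -F/ '{print "someString_" $2 "_" $4 "+" $6}')

printf '%s\n' "$new_name"
```

This prints someString_DE_AT_motovun+A2_300x250.zip.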

Update

As Stéphane suggested in the comments, it would probably be wiser and more robust to tune the split+glob operator (more information about it here) beforehand:

IFS=$'\n'
set -f

The first line is mandatory anyway if your filenames contain spaces, and the second line if they contain wildcards.

Don't forget to restore the previous settings afterwards if you don't want to tear your hair out over "strange" behaviour later… Assuming you haven't customized these settings:

unset IFS
set +f
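A small sketch of what these settings change, assuming a POSIX sh. Here list_names is a hypothetical stand-in for find that prints two awkward paths, and the newline is written literally because $'\n' is not understood by every sh:

```shell
#!/bin/sh
# Hypothetical stand-in for find: one path with spaces, one with a wildcard.
list_names() {
  printf '%s\n' './a dir/x y.zip' './b/*.zip'
}

IFS='
'        # newline only; a portable spelling of IFS=$'\n'
set -f   # disable globbing, so *.zip stays literal

count=0
for ii in $(list_names); do
  count=$((count + 1))
  if [ "$count" -eq 1 ]; then
    first=$ii      # remember the first path seen
  fi
  last=$ii         # and the most recent one
done

unset IFS   # back to default splitting (assumes IFS was not customized)
set +f      # re-enable globbing

printf '%s\n' "count=$count" "first=$first" "last=$last"
```

With the two settings in place the loop runs exactly twice, once per line; with the default IFS the first path would be split at each space, and without set -f the * would be subject to filename expansion.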
4
  • It is not a requirement at all :), it's just something I learned so far. I find the existing shell scripting tutorial quite awful for newbies. Commented Aug 30, 2016 at 15:15
  • Here, you're using the split+glob operator on the output of find, but are not tuning it properly. You'd want to set IFS to newline and disable the glob part. There's no reason you'd want to invoke it on $ii Commented Aug 30, 2016 at 15:34
  • @StéphaneChazelas I just learned a number of things reading your answer here, thanks! I feel it's not worth tuning IFS and the globbing if filenames are "standard", but it's clearly something to be aware of. Commented Aug 30, 2016 at 19:51
  • @AlexStarbuck Here is the result: mv ./DE_AT/adventure/motovun/300x250/A2_300x250.zip someString_DE_AT_motovun+A2_300x250.zip. I tell awk (option -F/) to split the string using / (by default it splits on blanks). $i refers to the ith field ($0 is the whole string). Since find prefixes its output with the directory where it searches, which we set to ., the first field ($1) is ., and therefore $2 is DE_AT. Commented Aug 31, 2016 at 13:12
