I have recovered files and currently they have this structure:

    root/MD5_of_file1/file1
    root/MD5_of_file2/file2
    ...
    root/MD5_of_filen/filen

Obviously, duplicates are now in the same folder. The filename contains no information, just the block number at which it was found during the recovery.

I want to flatten the structure, keeping only one file for each MD5. How can I do this efficiently?

Just to be clear, here are some actual data:

    feceee0fc150d191c5fd48ca6acee2f6
    feceee0fc150d191c5fd48ca6acee2f6/f225407559.odt
    feceee0fc150d191c5fd48ca6acee2f6/f94654911.odt
    e905bb0a76c0055a2be1b8285d39c715
    e905bb0a76c0055a2be1b8285d39c715/f0702423.odt
    e905bb0a76c0055a2be1b8285d39c715/f26479232.odt
    e905bb0a76c0055a2be1b8285d39c715/f3084695.odt

I want to flatten to something like this:

    f225407559.odt
    f0702423.odt

but there are no guarantees that filenames are distinct. Files could easily be renamed to the corresponding MD5 of their content, which is already computed, as it is the name of the folder in which they currently are.

  • ls root/MD5_of_file1/file* | grep -v 'root/MD5_of_file1/file1' | while read f ; do rm "$f" ; done Commented Nov 12, 2015 at 16:12
  • @younes I'm pretty sure I've read something about not parsing the output of ls.. but thanks for your comment. Commented Nov 12, 2015 at 16:20

2 Answers

for i in *(/); do mv $i/*([1]) $i.odt; rm -rf $i; done 

This uses zsh glob qualifiers: *(/) matches only directories, and *([1]) selects the first match in the default sort order (alphabetical by name).
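For shells without glob qualifiers, the same "keep the first file of each directory" idea can be sketched in POSIX sh. This is only an illustration under stated assumptions: the demo tree, the short directory names `aaa` and `bbb` (standing in for MD5 sums) and the variable `root` are made up, and the kept file reuses its own extension instead of a hard-coded `.odt`.

```shell
#!/bin/sh
# Hypothetical demo tree; "aaa" and "bbb" stand in for MD5-named directories.
root=$(mktemp -d)
mkdir "$root/aaa" "$root/bbb"
echo one > "$root/aaa/f2.odt"
echo one > "$root/aaa/f1.odt"
echo two > "$root/bbb/f9.odt"

for d in "$root"/*/; do
    d=${d%/}                    # strip the trailing slash left by the glob
    set -- "$d"/*               # positional parameters = sorted directory contents
    [ -e "$1" ] || continue     # glob did not match: directory is empty, skip it
    name=${1##*/}               # file name of the first entry, without its path
    case $name in
        *.*) ext=.${name##*.} ;;    # keep the original extension, if there is one
        *)   ext= ;;
    esac
    # Move the first file next to its directory, then drop the directory
    # together with the remaining duplicates.
    mv -- "$1" "$d$ext" && rm -rf -- "$d"
done

ls "$root"
```

Because `rm -rf` only runs when `mv` succeeds, a failed move leaves the directory and all its copies in place.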

  • And just rm -rf $i, to ditch the directory altogether. Commented Nov 12, 2015 at 17:33
  • That's better. A different approach (that should also work with other shells) would be something like this: for d in ./*/; do set -- "${d}"*; mv "$1" "${d%/*}.odt" && rm -rf "$d";done Commented Nov 12, 2015 at 21:25

In two steps:

    perl-rename 's;/([^/]*)/[^/]*$;/\1_file;' foo/**/*
    rmdir foo/**/

Example:

    $ tree foo
    foo
    ├── e905bb0a76c0055a2be1b8285d39c715
    │   ├── f0702423.odt
    │   ├── f26479232.odt
    │   └── f3084695.odt
    └── feceee0fc150d191c5fd48ca6acee2f6
        ├── f225407559.odt
        └── f94654911.odt

    2 directories, 5 files
    $ perl-rename -n 's;/([^/]*)/[^/]*$;/\1_file;' foo/**/*
    foo/e905bb0a76c0055a2be1b8285d39c715/f0702423.odt -> foo/e905bb0a76c0055a2be1b8285d39c715_file
    foo/e905bb0a76c0055a2be1b8285d39c715/f26479232.odt -> foo/e905bb0a76c0055a2be1b8285d39c715_file
    foo/e905bb0a76c0055a2be1b8285d39c715/f3084695.odt -> foo/e905bb0a76c0055a2be1b8285d39c715_file
    foo/feceee0fc150d191c5fd48ca6acee2f6/f225407559.odt -> foo/feceee0fc150d191c5fd48ca6acee2f6_file
    foo/feceee0fc150d191c5fd48ca6acee2f6/f94654911.odt -> foo/feceee0fc150d191c5fd48ca6acee2f6_file
    $ perl-rename 's;/([^/]*)/[^/]*$;/\1_file;' foo/**/*
    $ rmdir foo/**/
    rmdir: failed to remove ‘foo/’: Directory not empty
    $ tree foo
    foo
    ├── e905bb0a76c0055a2be1b8285d39c715_file
    └── feceee0fc150d191c5fd48ca6acee2f6_file

    0 directories, 2 files

Another way, using find, sort and awk:

find foo -type f | sort -k2,2 -u -t/ | awk -F/ -v OFS=/ '{path=$0; file=$NF; NF--; cmd = "cp " path " " $0 "_" file; ; system(cmd); system("rm -r "$0)}' 

Example:

    $ find foo -type f | sort -k2,2 -u -t/ | awk -F/ -v OFS=/ '{path=$0; file=$NF; NF--; cmd = "cp " path " " $0 "_" file; ; print cmd; print "rm -r "$0}'
    cp foo/e905bb0a76c0055a2be1b8285d39c715/f3084695.odt foo/e905bb0a76c0055a2be1b8285d39c715_f3084695.odt
    rm -r foo/e905bb0a76c0055a2be1b8285d39c715
    cp foo/feceee0fc150d191c5fd48ca6acee2f6/f225407559.odt foo/feceee0fc150d191c5fd48ca6acee2f6_f225407559.odt
    rm -r foo/feceee0fc150d191c5fd48ca6acee2f6
    $ find foo -type f | sort -k2,2 -u -t/ | awk -F/ -v OFS=/ '{path=$0; file=$NF; NF--; cmd = "cp " path " " $0 "_" file; ; system(cmd); system("rm -r "$0)}'
    $ tree foo
    foo
    ├── e905bb0a76c0055a2be1b8285d39c715_f3084695.odt
    └── feceee0fc150d191c5fd48ca6acee2f6_f225407559.odt

    0 directories, 2 files
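The awk commands build shell command lines from unquoted paths, which breaks on names containing spaces. A possible alternative, sketched here in plain sh, moves one file per directory with quoted variables. The demo tree, the shortened directory names `e905` and `fece` and the variable `root` are invented for illustration; `-mindepth` and `-maxdepth` are GNU/BSD find extensions rather than POSIX, and file names are assumed to contain no newlines.

```shell
#!/bin/sh
# Hypothetical demo tree with shortened stand-ins for the MD5 directory names.
root=$(mktemp -d)
mkdir "$root/e905" "$root/fece"
echo a > "$root/e905/f0702423.odt"
echo a > "$root/e905/f26479232.odt"
echo b > "$root/fece/f225407559.odt"

# sort reads the whole list before printing anything, so find has finished
# walking the tree by the time the loop starts moving and deleting things.
find "$root" -mindepth 2 -maxdepth 2 -type f | sort |
while IFS= read -r path; do
    dir=${path%/*}                  # parent directory (named after the MD5)
    [ -d "$dir" ] || continue       # already flattened by an earlier sibling
    # Keep this file as <dir>_<name>, then remove the directory and the rest.
    mv -- "$path" "${dir}_${path##*/}" && rm -r -- "$dir"
done

ls "$root"
```

The first file of each directory in sort order is kept, and later siblings are skipped because their directory no longer exists.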
  • So does the first step overwrite each file in each subfolder with its siblings? I had in mind something that would pick one and drop the others, but if it gets the job done.. Commented Nov 12, 2015 at 16:34
  • @AntoineLecaille overwrites each file. I can't think of a simple command to do it the other way. :( Commented Nov 12, 2015 at 16:37
  • I'm on something with zsh glob qualifiers.. Commented Nov 12, 2015 at 16:44
  • @AntoineLecaille see update. Commented Nov 12, 2015 at 16:56
  • I'd mv the file you keep, or at least cp -p, and awk can do the select-first: find foo -type f | awk -F/ -vOFS=/ '{path=$0;file=$NF;NF--} !already[$0]++ {system("mv "path" "$0"_"file); system("rm -r "$0)}' Commented Nov 12, 2015 at 22:18
