38

I have the MD5 sum of a file, but I don't know where the file is on my system. Is there an easy option of find to identify a file by its MD5 sum, or do I need to write a small script?

I'm working on AIX 6 without the GNU tools.

2
  • 4
    Wouldn't it be faster to narrow the search to files of the same size and then compute the MD5? Commented Mar 11, 2014 at 20:49
  • @RJ- Yes, maybe, but in this case it also allows me to check that the file is the correct one and has been transferred correctly. Commented Mar 12, 2014 at 8:28
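
The size-first idea from the comment above could be sketched as follows, assuming the file's size in bytes is known along with its checksum. `find_by_size_and_md5` is a hypothetical helper name, and `md5sum` is assumed to be installed (it may not be on a stock AIX system). `-size Nc` is the POSIX way of matching an exact size in bytes, so only same-size candidates get hashed.

```shell
# Hypothetical helper: find_by_size_and_md5 DIR SIZE_IN_BYTES MD5SUM
# Hashes only files whose size is exactly SIZE_IN_BYTES, then filters by checksum.
find_by_size_and_md5() {
    find "$1" -type f -size "$2"c -exec md5sum {} + 2>/dev/null | grep "^$3"
}

# Example call (placeholder values):
# find_by_size_and_md5 /tmp 1048576 file_md5sum_to_match
```

The function returns grep's exit status, so it is non-zero when nothing matches.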

4 Answers

41

Using find:

find /tmp/ -type f -exec md5sum {} + | grep '^file_md5sum_to_match' 

If you are searching through /, you can exclude /proc and /sys; see the find commands in the test results below.

I also did some testing: find takes more time but less CPU and RAM, whereas the Ruby script takes less time but more CPU and RAM.

Test Result

Find

    [root@dc1 ~]# time find / -type f -not -path "/proc/*" -not -path "/sys/*" -exec md5sum {} + | grep '^304a5fa2727ff9e6e101696a16cb0fc5'
    304a5fa2727ff9e6e101696a16cb0fc5  /tmp/file1

    real    6m20.113s
    user    0m5.469s
    sys     0m24.964s

Find with -prune

    [root@dc1 ~]# time find / \( -path /proc -o -path /sys \) -prune -o -type f -exec md5sum {} + | grep '^304a5fa2727ff9e6e101696a16cb0fc5'
    304a5fa2727ff9e6e101696a16cb0fc5  /tmp/file1

    real    6m45.539s
    user    0m5.758s
    sys     0m25.107s

Ruby Script

    [root@dc1 ~]# time ruby findm.rb
    File Found at: /tmp/file1

    real    1m3.065s
    user    0m2.231s
    sys     0m20.706s
3
  • 1
    You want to call -prune on /sys and /proc instead of descending into them and excluding files with -path. You should also prefer ! over -not for portability. Commented Mar 11, 2014 at 11:53
  • I've updated the answer with -prune; please check whether it is OK. Commented Mar 11, 2014 at 12:34
  • 1
    You certainly also want to exclude /dev. Commented Mar 12, 2014 at 10:21
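
Putting the comments above together, a sketch that prunes /proc, /sys and /dev rather than filtering their contents afterwards might look like this. `find_md5_pruned` is a hypothetical name. `-path` is assumed to be supported by the local find (it is in newer POSIX revisions, but this has not been checked on AIX 6), and `md5sum` is assumed to be installed.

```shell
# Hypothetical helper: find_md5_pruned ROOT MD5SUM
# -prune stops find from descending into the pseudo-filesystems at all.
find_md5_pruned() {
    find "$1" \( -path /proc -o -path /sys -o -path /dev \) -prune \
        -o -type f -exec md5sum {} + 2>/dev/null | grep "^$2"
}

# Example call (placeholder value):
# find_md5_pruned / file_md5sum_to_match
```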
12

Script Solution

    #!/usr/bin/ruby -w

    require 'find'
    require 'digest/md5'

    file_md5sum_to_match = [ '304a5fa2727ff9e6e101696a16cb0fc5',
                             '0ce6742445e7f4eae3d32b35159af982' ]

    Find.find('/') do |f|
      next if /(^\.|^\/proc|^\/sys)/.match(f) # skip
      next unless File.file?(f)
      begin
        md5sum = Digest::MD5.hexdigest(File.read(f))
      rescue
        puts "Error reading #{f} --- MD5 hash not computed."
      end
      if file_md5sum_to_match.include?(md5sum)
        puts "File Found at: #{f}"
        file_md5sum_to_match.delete(md5sum)
      end
      file_md5sum_to_match.empty? && exit # if array empty then exit
    end

Bash script solution that searches the most likely directories first, which works faster

    #!/bin/bash
    # Take the MD5 sum from the first argument, or prompt for it.
    [[ -z $1 ]] && read -p "Enter MD5SUM to search file: " md5 || md5=$1

    # Directories most likely to hold the file, searched first and in this order.
    check_in=( '/home' '/opt' '/tmp' '/etc' '/var' '/usr' )

    # Directories to prune in the final full scan; grows as each one is searched.
    prune=( -path /proc -o -path /sys )

    echo "Please wait... searching for file"
    for d in "${check_in[@]}"
    do
        find "$d" -type f -exec md5sum {} + | grep "^${md5}" && exit
        prune+=( -o -path "$d" )
    done

    # Not found yet: scan the rest of / while skipping everything covered above.
    find / \( "${prune[@]}" \) -prune -o -type f -exec md5sum {} + | grep "^${md5}"

Test Result

    [root@dc1 /]# time bash find.sh 304a5fa2727ff9e6e101696a16cb0fc5
    Please wait... searching for file
    304a5fa2727ff9e6e101696a16cb0fc5  /var/log/file1

    real    0m21.067s
    user    0m1.947s
    sys     0m2.594s
6
  • Which would you recommend? Commented Mar 11, 2014 at 10:21
  • @Kiwy I'm not recommending either; they are just for practice. Commented Mar 11, 2014 at 10:22
  • @Kiwy Have a look at the test results and let me know. Please also do some testing on your side and show us the results; it would be great to see results on AIX. :D Commented Mar 11, 2014 at 10:53
  • My main issue with your script is that it needs Ruby, which is not installed on my system, and I'm not an admin. But I will run some tests tonight if I find some time. Commented Mar 11, 2014 at 10:54
  • It seems faster than find in the end ^^. Maybe you could put the md5sum in a thread so you can compute 5 MD5 sums at the same time; that could also save a bit of time. Commented Mar 11, 2014 at 11:01
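
The idea of computing several checksums at once can be tried without threads by letting xargs run md5sum in parallel. This is only a sketch: `find_md5_parallel` is a hypothetical name, and `-print0`, `xargs -0` and `xargs -P` are extensions found in the GNU and BSD tools, so they are assumed to be available (they are not part of POSIX). Whether it helps at all depends on whether the disk or the CPU is the bottleneck.

```shell
# Hypothetical helper: find_md5_parallel DIR MD5SUM
# Runs up to 5 md5sum processes at a time. -n 1 gives each process one file,
# so every process writes a single short line and outputs are unlikely to interleave.
find_md5_parallel() {
    find "$1" -type f -print0 2>/dev/null |
        xargs -0 -n 1 -P 5 md5sum 2>/dev/null | grep "^$2"
}

# Example call (placeholder value):
# find_md5_parallel /var file_md5sum_to_match
```

Larger -n values spawn fewer processes but make interleaved output lines more likely.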
8

If you decide to install GNU find anyway (and since you indicated interest in one of your comments), you can try something like:

find / -type f \( -exec checkmd5 {} YOURMD5SUM \; -o -quit \) 

and have checkmd5 compare the MD5 sum of the file it gets as its first argument with the second argument, print the name if they match, and exit with 1 on a match (0 otherwise). Because -exec is only true for a zero exit status, a match makes find fall through to -quit, which stops the search as soon as the file is found.

checkmd5 (not tested):

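    #!/bin/bash
    # Usage: checkmd5 FILE MD5SUM
    md=$(md5sum "$1" | cut -d' ' -f1)
    if [ "$md" = "$2" ] ; then
        # Match: print the name and exit non-zero so that find evaluates -quit
        echo "$1"
        exit 1
    fi
    exit 0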
5
  • yum says "No package checkmd5 available"; please mention which package needs to be installed for checkmd5. Commented Mar 11, 2014 at 10:59
  • I like this solution. Too bad I don't have checkmd5, but I like the way you do it. Commented Mar 11, 2014 at 11:00
  • @kiwy script added. Commented Mar 11, 2014 at 11:06
  • 1
    @RahulPatil it is in the DIY distribution ;-) Commented Mar 11, 2014 at 11:06
  • @kiwy Sorry, I could have accepted your edit for -type f, but it undid the echo $1 I had already put in. Commented Mar 11, 2014 at 11:15
1

For people running macOS and stumbling on this page: you have to use md5 instead of md5sum or checkmd5, i.e.:

find . -type f -exec md5 {} + | grep 'file_md5sum_to_match' 

Caveat: don't put ^ before file_md5sum_to_match, otherwise it will never match anything, since md5 prints the filename before the MD5 sum.
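
Since md5 on macOS prints lines of the form `MD5 (filename) = checksum`, the match can still be anchored, just at the end of the line instead of the start. A small sketch, where `match_bsd_md5` is a hypothetical helper name that reads such lines on standard input and prints only the matching filenames:

```shell
# Hypothetical helper: match_bsd_md5 MD5SUM
# Reads "MD5 (file) = checksum" lines on stdin and prints the file names
# whose checksum is exactly MD5SUM.
match_bsd_md5() {
    sed -n "s/^MD5 (\(.*\)) = $1\$/\1/p"
}

# Example call on macOS (placeholder value):
# find . -type f -exec md5 {} + | match_bsd_md5 file_md5sum_to_match
```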
