I have an issue with computing an md5sum. I have a recovery tool that archives each file's metadata (inode) and also computes the md5sum of the file(s), storing both in an SQLite database during installation. When a file is removed or deleted, the tool recovers it using the metadata from the SQLite database. The recovery itself works. Now I want to make sure the recovered file is exactly the same as the original, so I recompute the recovered file's md5sum as shown below. The problem is that, strangely, for a few files the content looks exactly the same as before deletion (checked using cat) and the stat command shows the same output (apart from a different inode number), yet the md5sum is different.
The following two files have the same content and produce the same md5sum, so a different inode number by itself does not affect the md5sum.
    764efa883dda1e11db47671c4a3bbd9e /test/hi1.txt
    764efa883dda1e11db47671c4a3bbd9e /test/hi.txt

Any thoughts on how I should proceed with this?
    char file_location[512] = {0};
    char md5_cmd[512], md5sum[34];
    FILE *pf;

    // some recovery stuff goes here...

    // Recompute md5 of recovered file
    memset(md5_cmd, '\0', sizeof md5_cmd);
    snprintf(md5_cmd, sizeof md5_cmd, "md5sum %s", file_location);

    pf = popen(md5_cmd, "r");
    if (!pf) {
        fprintf(stderr, "Could not open pipe\n");
        return;
    }

    // get data: at most 33 characters (32 hex digits plus one space)
    if (!fgets(md5sum, sizeof md5sum, pf))
        md5sum[0] = '\0';

    if (pclose(pf) != 0)
        fprintf(stderr, "Error: close Failed.\n");

    fprintf(stdout, "Md5sum is %s\n", md5sum);
"I can see (using cat)"

What if there is stuff you cannot see? Control characters, spaces vs. tabs, a newline at the end of one file? Do a hexdump of both files and compare the hex.
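If you would rather check this from inside the tool than by eye, a byte-by-byte comparison gives you the offset of the first mismatch, which you can then look up in the hexdump. The following is only a minimal sketch: `first_difference` is a hypothetical helper name, not part of your code, and it assumes you still have a copy of the original file to compare against.

```c
#include <stdio.h>

/*
 * Hypothetical helper (name and return convention are assumptions).
 * Returns the zero-based offset of the first differing byte,
 * -1 if the two files are byte-for-byte identical,
 * -2 if either file cannot be opened.
 * If one file is a prefix of the other, the returned offset is the
 * length of the shorter file.
 */
long first_difference(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    long offset = 0;
    long result = -1;

    if (!fa || !fb) {
        if (fa) fclose(fa);
        if (fb) fclose(fb);
        return -2;
    }

    for (;;) {
        int ca = fgetc(fa);
        int cb = fgetc(fb);

        if (ca != cb) {      /* also true when only one file hit EOF */
            result = offset;
            break;
        }
        if (ca == EOF)       /* both files ended at the same point */
            break;
        offset++;
    }

    fclose(fa);
    fclose(fb);
    return result;
}
```

Calling `first_difference(original_path, recovered_path)` on one of the files with a mismatching md5sum should tell you where to look. A result equal to the length of the shorter file would point to extra or missing trailing bytes, such as a final newline.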