I have two external storage devices of 1 TB each and I want to back up both of them to a server. I want to use rsync for this, but I have found that of the ~100,000 files on each device, ~80,000 have the same name and directory path on both. I could rsync the two devices separately, which would merge the files, but I want a way to find out whether the 'mutual' files have the same content, because I don't want to lose a file that has been modified on one of the devices. Is there a way of checking for this using rsync?
- So then your question is: "How do I back up only modified files without overwriting existing files?" Simply put, you want to place files from location A into location B without overwriting any files in location B, but are concerned because the directory structure/filenames are identical. I'm just trying to gain clarification. – Centimane, Dec 20, 2016 at 15:20
- Kind of. I would like to back up files from location A and location B into location C. However, there is a lot of crossover between locations A and B with the same file names, and I'm not sure whether those files have been updated/modified. So if I move all files from location A -> C, and then B -> C, some files from location B will overwrite the 'same' file from location A. – trouselife, Dec 20, 2016 at 15:57
3 Answers
Consider using rsync's "-c" (--checksum) flag. By default rsync skips a file when its size and modification time match on both sides; with -c it instead compares a checksum for every file whose size matches, so two files are treated as identical only if their content is identical. More on it here: https://serverfault.com/questions/211005/rsync-difference-between-checksum-and-ignore-times-options
Then, to copy only files that are new or newer on the local machine (--update skips any file that is newer on the destination), do a dry run before copying anything:
rsync -av --dry-run --update Documents/* [email protected]:/<directory>
If the result looks right, run the same command without --dry-run:
rsync -av --update Documents/* [email protected]:/<directory>
For more info see: http://www.tecmint.com/sync-new-changed-modified-files-rsync-linux/
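The check the question asks for (do the same-named files on the two drives have the same content?) can also be sketched locally, before anything is sent to the server. The following is a minimal sketch that does not use rsync at all: it assumes both drives are mounted, that file names contain no newlines, and the function name `list_differing` is made up for illustration.

```shell
# list_differing DIR_A DIR_B
# Print the relative path of every regular file that exists in both
# trees but whose contents differ. Hypothetical helper, not part of rsync.
list_differing() {
    dir_a=$1
    dir_b=$2
    (cd "$dir_a" && find . -type f) | LC_ALL=C sort | while IFS= read -r rel; do
        rel=${rel#./}
        # Only files present on both sides can conflict.
        [ -f "$dir_b/$rel" ] || continue
        # cmp -s prints nothing and returns non-zero when the bytes differ.
        cmp -s "$dir_a/$rel" "$dir_b/$rel" || printf '%s\n' "$rel"
    done
}

# Example (mount points are placeholders):
# list_differing /mnt/driveA /mnt/driveB > conflicts.txt
```

Every path printed is a file that would be silently replaced if both drives were synced into the same destination, so those are the ones worth reviewing by hand.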
For both transfers you can use:
rsync -a --ignore-existing -i dir1/ [email protected]:/dir2/
(-a makes rsync recurse into dir1/; without -r or -a it skips directories.) For the first transfer it will simply transfer all the files and list them.
For the second transfer it will leave out files that already exist (--ignore-existing) and list all the files that it actually copied (-i). You could then choose to copy over the files that were left out of the transfer by using find to get a list of all the files and removing from it the files listed in the output of transfer 2.
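The "find minus transferred" step could look roughly like this. It is a sketch under assumptions: the output of the second transfer has been saved to a file, rsync is version 3.x (where each itemized line is an 11-character change string, a space, then the path), file names contain no newlines, and the function names are invented for this example.

```shell
# transferred_files LOG
# Extract the paths of transferred regular files from saved `rsync -i` output.
# Assumes the rsync 3.x line layout: 11-character change string, space, path.
transferred_files() {
    sed -n 's/^[<>]f.\{9\} //p' "$1" | LC_ALL=C sort
}

# skipped_files SRC_DIR LOG
# Print files under SRC_DIR that are absent from the log, i.e. the ones
# that --ignore-existing left behind.
skipped_files() {
    src=$1
    log=$2
    all=$(mktemp)
    sent=$(mktemp)
    (cd "$src" && find . -type f) | sed 's|^\./||' | LC_ALL=C sort > "$all"
    transferred_files "$log" > "$sent"
    # comm -23 keeps lines that appear only in the first sorted input.
    LC_ALL=C comm -23 "$all" "$sent"
    rm -f "$all" "$sent"
}

# Example (log file name is a placeholder):
# skipped_files dir1 transfer2.log > skipped.txt
```

The resulting list is exactly the set of name collisions, which can then be compared or copied separately.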
Unfortunately, there isn't an option to copy a file to a slightly different filename if it already exists; that would require some extra logic and a loop.
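As an illustration of that extra logic, here is one way the loop might be written. This is only a sketch: it uses cp rather than rsync, so it assumes the destination is a local or mounted directory rather than an SSH target, and the function name `merge_keep_both` and the suffix argument are hypothetical.

```shell
# merge_keep_both SRC_DIR DEST_DIR SUFFIX
# Copy every file from SRC_DIR into DEST_DIR. If a file with the same
# relative path already exists and its content differs, keep both by
# writing the incoming file as <name><SUFFIX> instead of overwriting.
merge_keep_both() {
    src=$1
    dest=$2
    suffix=$3
    (cd "$src" && find . -type f) | while IFS= read -r rel; do
        rel=${rel#./}
        target=$dest/$rel
        mkdir -p "$(dirname "$target")"
        if [ ! -e "$target" ]; then
            # New file: copy it, preserving the modification time.
            cp -p "$src/$rel" "$target"
        elif ! cmp -s "$src/$rel" "$target"; then
            # Same name, different content: store it next to the original.
            cp -p "$src/$rel" "$target$suffix"
        fi
        # Same name, same content: nothing to do.
    done
}

# Example (paths and suffix are placeholders):
# merge_keep_both /mnt/driveB /srv/backup .from-driveB
```

Identical duplicates are skipped, so only genuinely modified files end up with a suffixed copy.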