I'm hoping to clean up about 20TB on my NAS with rsync on Linux, excluding any directory (and all of its contents) that contains a ".protect" file.
I generate really large caches in subfolders like:
cache/simulation_v001/reallybigfiles_*.bgeo
cache/simulation_v002/reallybigfiles_*.bgeo
cache/simulation_v003/reallybigfiles_*.bgeo
If a file like cache/simulation_v002/.protect exists, I'd like to build an rsync operation that moves all folders to a temporary /recycle location, excluding cache/simulation_v002/ and all its contents.
I've done something like this before with Python, but I'm curious to see if the operation can be simplified with rsync or another method.
rsync alone can't do this, but you could use find to construct an exclude file for rsync, e.g. starting with something like:

find . -name .protect -printf '%h/***\n'

./simulation_v002/***

But this will then still end up including files it shouldn't:

rsync -a -m --remove-source-files --exclude-from='cache/exclude_list.txt' cache/ cache_trash

Is it possible for find to generate simulation_v002/*** instead?

Pipe it through sed -e 's=^\./==' to strip the leading ./. Don't expect one tool to do everything. It's normal to combine multiple small tools to achieve a desired result, each tool being good at its own job: find to get the list of files, sed to transform it into the required format, rsync to do the copy.
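The find, sed, and rsync steps from this thread can be combined into one small script. The following is only a minimal sketch, assuming GNU find (for -printf) and a POSIX shell. The function names build_exclude_list and recycle_unprotected are hypothetical, and writing the exclude list to a mktemp file (rather than inside cache/) is a choice made here so that rsync does not move the list itself.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers, not part of rsync.

# Print one rsync exclude pattern for each directory under "$1"
# that contains a .protect file.
build_exclude_list() {
    # -printf is a GNU find extension; %h is the directory holding the match.
    # sed strips the leading "./" so the pattern is relative to the
    # directory rsync is given as its source.
    (cd "$1" && find . -name .protect -printf '%h/***\n' | sed -e 's=^\./==')
}

# Move everything that is not protected from "$1" into "$2".
recycle_unprotected() {
    src=$1
    dest=$2
    # Keep the list outside the source tree so rsync does not move it too.
    list=$(mktemp) || return 1
    build_exclude_list "$src" > "$list"
    rsync -a -m --remove-source-files --exclude-from="$list" "$src"/ "$dest"
    status=$?
    rm -f "$list"
    return "$status"
}
```

With the layout from the question, recycle_unprotected cache cache_trash would be the call to try. Adding -n (--dry-run) to the rsync line first is a cheap way to check what would be moved. Note that --remove-source-files deletes only files, so emptied directories stay behind in the source.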