I need to speed up the process of syncing log files between servers.
The machine that generates the logs (LOGMACHINE) creates them in a tree like this:
    /Files
    /Files/LOGS1/
    /Files/LOGS1/logFiles/
    /Files/LOGS1/logFiles/typeLog1A
    /Files/LOGS1/logFiles/typeLog1B
    /Files/LOGS1/logFiles/typeLog1C
    /Files/LOGS1/logFiles/typeLog1C/fileLog1C-20210113-0900.xml.gz
    /Files/LOGS1/logFiles/typeLog1C/fileLog1C-20210113-0915.xml.gz
    /Files/LOGS1/logFiles/typeLog1C/fileLog1C-20210113-0930.xml.gz
    /Files/LOGS1/logFiles/typeLog2A
    /Files/LOGS1/logFiles/typeLog2A/fileLog2A-20210113-0900.xml.gz
    /Files/LOGS1/logFiles/typeLog2A/fileLog2A-20210113-0915.xml.gz
    /Files/LOGS1/logFiles/typeLog2A/fileLog2A-20210113-0930.xml.gz
    /Files/LOGS2/
    /Files/LOGS2/logFiles/
    /Files/LOGS2/logFiles/typeLog1A
    /Files/LOGS2/logFiles/typeLog1B
    /Files/LOGS2/logFiles/typeLog1C
    /Files/LOGS2/logFiles/typeLog1C/fileLog1C-20210113-0900.xml.gz
    /Files/LOGS2/logFiles/typeLog1C/fileLog1C-20210113-0915.xml.gz
    /Files/LOGS2/logFiles/typeLog1C/fileLog1C-20210113-0930.xml.gz
    /Files/LOGS2/logFiles/typeLog2A
    /Files/LOGS2/logFiles/typeLog2A/fileLog2A-20210113-0900.xml.gz
    /Files/LOGS2/logFiles/typeLog2A/fileLog2A-20210113-0915.xml.gz
    /Files/LOGS2/logFiles/typeLog2A/fileLog2A-20210113-0930.xml.gz

There are around 4000 typeLog1* folders and 9000 typeLog2* folders. Each one gets a new file every 15 minutes.
I have two servers: SERV1 syncs the typeLog1* folders and SERV2 syncs the typeLog2* folders, both from LOGMACHINE. Each server syncs both the LOGS1 and LOGS2 folders.
Right now I'm using rsync, which takes 30 minutes just to fetch one of the LOGS folders. As a result, each file arrives roughly 30 minutes to 1 hour late.
I wrote a solution that runs several rsync processes in parallel. Unfortunately I can only have 8 ssh sessions open in parallel; that's a limitation of the machine that creates the logs.
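For illustration, a minimal sketch of this kind of capped parallel run is shown below. It is not the exact script in use: the function names split_list and parallel_sync, the variables MAX_JOBS and RSYNC_BIN, and the chunk-file naming are all assumptions made for the example. The idea is to split one file list into at most 8 chunks and start one rsync per chunk, so the number of simultaneous ssh sessions never exceeds the limit.

```shell
# Sketch only: all names here (MAX_JOBS, RSYNC_BIN, split_list, parallel_sync)
# are illustrative assumptions, not part of the existing setup.
MAX_JOBS=8                     # LOGMACHINE allows at most 8 ssh sessions
RSYNC_BIN=${RSYNC_BIN:-rsync}  # overridable, e.g. for a dry run with a stub

# Distribute the lines of LIST_FILE round-robin over at most MAX_JOBS
# chunk files named PREFIX.0 ... PREFIX.7.
# Usage: split_list LIST_FILE PREFIX
split_list() {
    local list="$1"
    local prefix="$2"
    rm -f "$prefix".*          # drop chunks left over from an earlier run
    awk -v n="$MAX_JOBS" -v p="$prefix" \
        '{ print > (p "." (NR % n)) }' "$list"
}

# Start one transfer per chunk file in the background and wait for all of
# them. Because split_list creates at most MAX_JOBS chunks, no more than
# MAX_JOBS transfers run at once.
# Usage: parallel_sync PREFIX SRC DEST
parallel_sync() {
    local prefix="$1"
    local src="$2"
    local dest="$3"
    local pids=""
    local status=0
    local chunk pid
    for chunk in "$prefix".*; do
        [ -e "$chunk" ] || continue   # glob matched nothing
        "$RSYNC_BIN" -az --rsync-path=/usr/local/bin/rsync --ignore-existing \
            --files-from="$chunk" "$src" "$dest" &
        pids="$pids $!"
    done
    for pid in $pids; do
        wait "$pid" || status=1       # remember if any transfer failed
    done
    return "$status"
}

# Example call, assuming list.txt already holds the output of the remote find:
#   split_list list.txt /tmp/chunk
#   parallel_sync /tmp/chunk user@logmachine:/home/user/Files/LOGS1/logFiles/ Files/LOGS1/logFiles/
```

Note that fetching the file list over ssh uses a session of its own, so it should finish before the transfers start if all 8 slots are to be used for rsync.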
Limitations:
- I have to use ssh.
- I can't install any software on the machine that creates the logs.
Is there any way to speed up the process, using rsync or an alternative?
Update:
Current rsyncs:
On SERV1:
    rsync -avz --rsync-path=/usr/local/bin/rsync --ignore-existing --delete \
      --files-from=<(ssh user@logmachine 'cd /home/user/Files/LOGS1/logFiles/; find . -mtime -1 -type f -name "*fileLog1*.xml.gz"') \
      user@logmachine:/home/user/Files/LOGS1/logFiles/ Files/LOGS1/logFiles/

    rsync -avz --rsync-path=/usr/local/bin/rsync --ignore-existing --delete \
      --files-from=<(ssh user@logmachine 'cd /home/user/Files/LOGS2/logFiles/; find . -mtime -1 -type f -name "*fileLog1*.xml.gz"') \
      user@logmachine:/home/user/Files/LOGS2/logFiles/ Files/LOGS2/logFiles/

On SERV2:

    rsync -avz --rsync-path=/usr/local/bin/rsync --ignore-existing --delete \
      --files-from=<(ssh user@logmachine 'cd /home/user/Files/LOGS1/logFiles/; find . -mtime -1 -type f -name "*fileLog2*.xml.gz"') \
      user@logmachine:/home/user/Files/LOGS1/logFiles/ Files/LOGS1/logFiles/

    rsync -avz --rsync-path=/usr/local/bin/rsync --ignore-existing --delete \
      --files-from=<(ssh user@logmachine 'cd /home/user/Files/LOGS2/logFiles/; find . -mtime -1 -type f -name "*fileLog2*.xml.gz"') \
      user@logmachine:/home/user/Files/LOGS2/logFiles/ Files/LOGS2/logFiles/
The file name pattern in the find command is necessary because there are other files in those folders.