Summary: I want to transfer a file that changes over time, probably faster than the available connection can complete a transfer.
Setup:
- A remote device with bad network reliability and speed most of the time, but with "windows" of decent connection (able to scp more than ~100 MB overnight).
- The device could spend anywhere from minutes to days (but usually not weeks) without reception.
- The device logs to a sqlite database with current date-time, several times a day.
- The db is around 5-10 MB, and experimentally I've found it compresses to about 10% of its original size.
- The database could be reset/deleted by a third party (infrequently, e.g. once per month or less, but it happens).
- I need to keep a local copy of the database.
Current solution:
I have a script which does the following:
- Connect via ssh to the device.
- "snapshot" the db to a compressed file, unless a snapshot already exists.
- rsync -ruvhP the snapshot to a local drive (this starts a new transfer, or continues an interrupted one).
- untar the transferred file to a sqlite file.
- Read first date-time entry of db. Use it as a unique name to rename the local database.
- Delete the remote snapshot (but not the local one). This lets the next run transfer new rows from the db (or the whole db if it was reset), but wastes bandwidth re-downloading data we already have (the old rows, if it wasn't reset). If the newly created snapshot is identical to the old one, rsync does not waste bandwidth, although the script still created and destroyed the snapshot for nothing.
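A minimal sketch of the local half of these steps (decompress, check, rename), written so that a partial or corrupted transfer never becomes the local db. Everything named here is an assumption for illustration: the snapshot is taken to be a single gzip-compressed file rather than a tar archive, the log table is assumed to be called log with a date-time column ts, and install_snapshot is a hypothetical helper, not part of the script described above.

```python
import gzip
import os
import shutil
import sqlite3
import tempfile
import zlib


def install_snapshot(compressed_path, dest_dir, table="log", ts_column="ts"):
    """Decompress a transferred snapshot, verify it, and rename it after its
    first date-time entry. Returns the final path, or None if the file is
    not a usable database. Table and column names are assumptions."""
    # Work on a temp file in dest_dir so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".sqlite", dir=dest_dir)
    os.close(fd)
    try:
        with gzip.open(compressed_path, "rb") as src, open(tmp_path, "wb") as dst:
            shutil.copyfileobj(src, dst)

        conn = sqlite3.connect(tmp_path)
        try:
            # Reject damaged files before they are given a real name.
            if conn.execute("PRAGMA integrity_check").fetchone()[0] != "ok":
                return None
            row = conn.execute(
                f'SELECT "{ts_column}" FROM "{table}" ORDER BY rowid LIMIT 1'
            ).fetchone()
        finally:
            conn.close()
        if row is None or row[0] is None:
            return None  # empty db: nothing to name it after

        # Make the date-time safe to use as a file name.
        name = str(row[0]).replace(":", "-").replace(" ", "_") + ".sqlite"
        final_path = os.path.join(dest_dir, name)
        os.replace(tmp_path, final_path)  # atomic when on the same filesystem
        return final_path
    except (OSError, EOFError, zlib.error, sqlite3.DatabaseError):
        # Truncated archive, bad gzip data, or a file that is not a database.
        return None
    finally:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

With something like this, the remote snapshot would only be deleted when install_snapshot returns a path; a None result means the transfer should be retried.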
Problem: Depending on the step of the implementation above at which the connection fails, there are lots of edge cases that produce an unusable, corrupted or incomplete db. This is starting to look a lot like reinventing the wheel...
Questions:
- Are there libraries/solutions to this problem? I imagine services like Dropbox and similar have situations like these solved, except for:
- Ideally, I would like to avoid installing services on the remote device.
- In the case the remote device "resets" the db, the new remote version would overwrite the local one without keeping the previous local copies, at least in the Dropbox case.
- Any alternative patterns/better implementations to deal with edge cases?
- What about simpler storage alternatives, like CSV (to allow incremental transfers), or rsync --compress?
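To make the CSV idea concrete, here is a rough sketch of what an incremental export could look like: only rows newer than the last date-time already held locally are written out, so each transfer stays small. It assumes Python is available on the device, that the table is called log with a ts column, and that date-times are stored in a format that sorts correctly as text (such as ISO 8601). export_rows_since is a hypothetical name.

```python
import csv
import io
import sqlite3


def export_rows_since(conn, last_ts, table="log", ts_column="ts"):
    """Return CSV text with a header row plus the rows newer than last_ts.
    Passing last_ts=None exports every row (e.g. after a reset)."""
    query = f'SELECT * FROM "{table}"'
    params = ()
    if last_ts is not None:
        query += f' WHERE "{ts_column}" > ?'
        params = (last_ts,)
    query += f' ORDER BY "{ts_column}"'

    cursor = conn.execute(query, params)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])  # column names
    writer.writerows(cursor)  # each remaining row, oldest first
    return out.getvalue()
```

A reset on the device would show up as the remote's first date-time being newer than the local one, in which case last_ts=None would fetch everything into a new local file.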