
Platform: Ubuntu 10.04 x86.

We have an HTTP server (nginx, but that is not relevant) which serves some static content. Content is (rarely) uploaded by content managers via SFTP, but may also be changed or added by other means (like a cat done directly on the server).

Now we want to add a second, identical HTTP server — a slave mirror in another data-center on another continent. (And set up DNS round-robin.)

What is the best way to set up synchronization between the master server and the slave mirror, so that the delay between a modification and its re-synchronization is minimal (a few seconds should be bearable, though)?

The solution must cope with large changesets and race conditions. That is, if I change 1000 files, it should not spawn 1000 synchronization processes. And if I change something while a synchronization is in progress, my new change must eventually make it to the mirror as well... And so on.

Rejected solutions:

  • CDN — not worth the money for our particular usage scenario.
  • NFS — not over the global Internet.
  • dumb cron + rsync — latency and/or system load would be too high.
  • manual rsync — not reliable, since content is changed by non-IT users.

I would say that we need something based on inotify. Is there a ready-made solution?
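
For illustration, here is a naive sketch of the kind of inotify-based loop I mean (inotifywait comes from the inotify-tools package; the paths and mirror hostname are placeholders). Note that it still loses events that arrive while rsync is running, which is exactly the kind of race condition a ready-made solution should handle properly:

    # Naive inotify + rsync loop: inotifywait blocks until something
    # changes anywhere under /var/www, then a single rsync pushes the
    # whole batch (so 1000 changed files do not spawn 1000 syncs).
    while inotifywait -r -e modify,create,delete,move /var/www; do
        rsync -az --delete /var/www/ mirror.example.com:/var/www/
    done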

Update: two extra (rather obvious) requirements that I forgot to mention:

  • If data is somehow changed on the slave mirror (say, a superuser accidentally deletes a file), the sync solution must restore the data to the master state on the next sync.

  • When idle, the solution must not consume traffic or system resources (other than some memory etc. for the sleeping daemon process, of course).

Update 2: one more requirement:

  • The solution must work with UTF-8 file names.
  • It looks similar to serverfault.com/questions/157901/… Commented Jun 7, 2011 at 19:59
  • lsyncd — see: serverfault.com/questions/7969/… Commented Jun 7, 2011 at 20:01
  • @Mircea: please add lsyncd as a regular answer, so it can be upvoted / discussed properly. ;-) Commented Jun 7, 2011 at 20:03
  • Wait, seriously, how did you get the cat to make server content? Do you work for WikiHow or some other content farm? Commented Jun 7, 2011 at 23:36
  • Well, something like cat >crossdomain.xml and type a bit (or just paste into the terminal). It is a rare event, but it can happen, and the sync solution must be ready for it. ;-) The point is that I can't use an SFTP hook or anything like that — there are multiple potential sources of changes. Commented Jun 8, 2011 at 6:22

4 Answers


What about pirsyncd, a Python daemon that watches directories with inotify and mirrors changes with rsync? I think it could be a good fit for you. ;)


Have you considered Unison as a means of keeping files in sync? Using it, you'd be able to do the one-way sync you're requesting. It seems like a reasonable fit for this application.
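
For example (the hostname and paths here are my own placeholders), forcing the master replica makes Unison behave as a one-way push, so stray changes on the mirror get rolled back:

    # One-way push: -force makes the master replica authoritative,
    # -batch suppresses interactive prompts, -times preserves mtimes.
    unison /var/www ssh://mirror.example.com//var/www \
        -batch -force /var/www -times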

  • It does not work with UTF-8 file names. I've added this requirement to the question. Commented Jun 7, 2011 at 22:34

You could use lsyncd; see: Is there a working Linux backup solution that uses inotify?
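
A minimal invocation, for the record (hostname and paths are placeholders): lsyncd watches the tree with inotify, collects events for a short delay, and then runs rsync over SSH only for what changed:

    # Watch /var/www recursively via inotify and push batched changes
    # to the mirror over SSH; rsync only runs when events arrive.
    lsyncd -rsyncssh /var/www mirror.example.com /var/www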

  • It does not restore files deleted on the mirror. (I've updated the question to reflect this requirement.) Commented Jun 7, 2011 at 22:33
  • (Or maybe I'm missing something... Will try again.) Commented Jun 7, 2011 at 23:07

It seems like this is a case where you might want to write a script that checks file timestamps: if a file's timestamp is later than the script's last run, assume it needs to be pushed, and trigger rsync (or some other tool) to synchronize it. Likewise, on the other side, check whether a file has changed and, if so, trigger a pull. Fabric might actually be a good tool for this; if you are familiar with Python, Fabric combined with timestamp checking may be the way to go.
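
A rough sketch of the push half of that idea in shell (the stamp file location, paths, and hostname are made up), using a stamp file instead of parsing timestamps by hand:

    #!/bin/sh
    # Push the tree if anything changed since the last successful run.
    # The new stamp is touched *before* rsync starts, so files modified
    # during a running sync are still newer and get picked up next time.
    STAMP=/var/run/mirror-sync.stamp

    if [ ! -f "$STAMP" ] || [ -n "$(find /var/www -newer "$STAMP" -print -quit)" ]; then
        touch "$STAMP.new"
        rsync -az --delete /var/www/ mirror.example.com:/var/www/ \
            && mv "$STAMP.new" "$STAMP"
    fi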

  • Sorry, but (1) I explicitly said that the solution should be invoked automatically and (2) this task is too full of potential pitfalls (again, see the question for an incomplete list) to write a script by hand without trying existing solutions. Commented Jun 8, 2011 at 19:24
  • I personally do not believe this would take a lot of work to write, and without cron it could run as a simple daemon, which is completely hands-off. This is a lightweight solution in general, and I would argue that it has advantages over other possible solutions. In fact, I had something similar to these requirements and implemented something much like this. The process basically ran as a daemon and, after each pass, went to sleep for a preset amount of time. A checker script run by Puppet made sure the process would be restarted if it ever died. Commented Jun 9, 2011 at 2:52
