Data replication via message broker or built-in replication feature?

Question

I'm working on a microservice system where 8 services share the same data - most notably: auth, forum, portal.

Essentially, forum and portal use users from auth. Since I went with the 'AP' side of the CAP theorem, I copy user data into the forum and portal databases whenever it changes, so that auth is no longer a single point of failure (SPoF) if it dies.

Currently I do this with a message broker (RabbitMQ): forum and portal listen for UserCreated, UserDeleted and UserUpdated events and take the data from them.

Should I be doing it this way, or should I use a replication feature built into Postgres? I imagine that my approach allows for replication across multiple database types, be it: MS SQL, Postgres, MySQL, NoSQL or NewSQL solutions.

Having worked in a system which used a message broker for database replication, I'd advise first considering KISS, YAGNI and DRY, because you could very well end up with a system which violates all three of these. In this particular case, the database replication ended up being much larger and more complex than the rest of the system combined; it needed dozens of extra microservices and infrastructure to provide sufficient resilience, reliability, recovery and observability, which caused a massive amount of pain when diagnosing and fixing live environment issues.
– Ben Cottrell, Commented Nov 8, 2020 at 12:19
@BenCottrell Hmm, yeah, that sounds like a burden. Though, I prefer replication over aggregation services.
– JTinkers, Commented Nov 8, 2020 at 20:38

Michael · Accepted Answer · 2020-11-08 00:48:06Z

The main difference between the two is that messaging allows you to apply transformations and enrichments to the data being replicated whereas straight up database replication requires the database schema to be identical (or extremely close, depending on the software).

If you can use database replication, it might be time to ask why you are looking to split up the database in the first place. You may not need to have separate data stores and your architecture might be simplified by using a single database across microservices. I am aware of the literature indicating this to be a Bad Thing (TM), but this is written from the perspective of larger systems with the need to individually scale components. If this does not apply to you, a single database (or replica set) might just make sense.

If you have very different perspectives on the data and are transforming it for faster access in the other microservices, messaging is the preferred choice. I would definitely avoid using messaging to create an exact replica as it would be work without value.
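To make the "transformation" case concrete, here is a minimal sketch of a forum-side handler that keeps only the columns the forum needs from each user event. Everything in it is an assumption made for illustration: the event shape, the `forum_users` table, the `FORUM_COLUMNS` subset, and the use of SQLite in place of the real forum database. The RabbitMQ consumer that would call `apply_event` is left out.

```python
import sqlite3

# Assumed subset of the auth user record that the forum needs (hypothetical).
FORUM_COLUMNS = ("id", "username", "avatar_url")


def open_forum_db():
    """Create an in-memory stand-in for the forum's own database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE forum_users ("
        "id INTEGER PRIMARY KEY, username TEXT NOT NULL, avatar_url TEXT)"
    )
    return db


def project(user):
    """Drop every field the forum does not store (e.g. email, password hash)."""
    return tuple(user.get(col) for col in FORUM_COLUMNS)


def apply_event(db, event):
    """Apply one user event to the forum's local copy.

    Assumed event shape: {"type": "UserUpdated", "user": {...auth record...}}.
    """
    kind = event["type"]
    if kind in ("UserCreated", "UserUpdated"):
        # INSERT OR REPLACE keeps the handler idempotent, so a redelivered
        # message simply overwrites the row with the same values.
        db.execute(
            "INSERT OR REPLACE INTO forum_users (id, username, avatar_url) "
            "VALUES (?, ?, ?)",
            project(event["user"]),
        )
    elif kind == "UserDeleted":
        db.execute("DELETE FROM forum_users WHERE id = ?", (event["user"]["id"],))
    else:
        raise ValueError(f"unknown event type: {kind}")
    db.commit()
```

If the projection ever becomes the identity (every column copied unchanged), this handler is exactly the "work without value" case, and built-in replication would do the same job with less code.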

I can't use the same database, because the forum service is already used 17 times more heavily than portal or auth. The user structure will be different in most services, mainly with fewer (but no additional) columns here and there.
– JTinkers, Commented Nov 8, 2020 at 1:08
I get that. Obviously, I can't see the monitoring data; I just mention it to call out the possibility :). Assuming you need to split it and the difference is purely subtractive (i.e. fewer columns needed in one place than another), I'd probably start with database replication rather than messaging. The cost of storing a few extra columns would generally be much lower than the cost of hand-developing a solution to replicate a smaller amount of data. If you start to see more aggregates, then messaging is an excellent solution.
– Michael, Commented Nov 8, 2020 at 1:14
Yes, what you say makes a lot of sense. I feel like messaging is a bit more future-proof. Though, I've just realized that certain data is meant to be deleted if a user is removed, such as forum threads and posts. I feel like messaging is better in this case, as opposed to running a background job that checks for replication changes.
– JTinkers, Commented Nov 8, 2020 at 1:32
Agreed. Any tasks that are triggered by discrete events are excellent places for messaging.
– Michael, Commented Nov 8, 2020 at 2:05
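As a rough illustration of the event-triggered cleanup discussed in these comments, the sketch below shows what a forum-side UserDeleted handler might do. The table names, the `on_user_deleted` function, and the policy of also removing other users' replies inside a deleted user's threads are all assumptions, and SQLite stands in for the forum's real database.

```python
import sqlite3


def open_forum_db():
    """Create an in-memory stand-in for the forum's own database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE forum_users (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE threads (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY, thread_id INTEGER, author_id INTEGER, body TEXT
        );
        """
    )
    return db


def on_user_deleted(db, user_id):
    """Remove a user and the forum content that depends on them.

    Runs in a single transaction, so a failure part-way through leaves
    the forum data untouched and the message can be retried.
    """
    with db:  # commits on success, rolls back if an exception is raised
        # Assumed policy: the user's own posts go, and so does every post
        # inside a thread the user started.
        db.execute(
            "DELETE FROM posts WHERE author_id = ? "
            "OR thread_id IN (SELECT id FROM threads WHERE author_id = ?)",
            (user_id, user_id),
        )
        db.execute("DELETE FROM threads WHERE author_id = ?", (user_id,))
        db.execute("DELETE FROM forum_users WHERE id = ?", (user_id,))
```

Because the deletes are keyed on the user id, handling the same UserDeleted message twice is harmless, which matters when the broker redelivers after a missed acknowledgement.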
