-1

I find a lot of resources regarding strong consistency versus eventual consistency, and when to choose one or the other.

I'm wondering how and when you would have both together. Consider this specific scenario:

  1. Order service receives and stores order to database.
  2. Order service sends OrderReceived message on the message bus.
  3. Another service (or even another aggregate in Order service) receives message and performs an operation.

We really don't want to store the order successfully then fail to deliver message on bus, or we risk downstream services never receiving the update, right?

Am I wrong in thinking that operations 1 and 2 should almost always (there are probably exceptions to this case) be transactionally consistent?

I'm asking because my teammates and I are inexperienced in this area, and we have services that do not put operations 1 and 2 in transactions.

EDIT: More generally speaking, if you have this saga scenario:

  1. Service A
    1. Performs operation A successfully
    2. Communicates update with message A
  2. Service B (after receiving message A)
    1. Performs operation B successfully
    2. Communicates update with message B
  3. Service C (after receiving message B)
    1. Fails to perform operation C
    2. Communicates failure
  4. Services A and B perform compensating actions

My assumption is that while the entire workflow must be eventually consistent, the internal/local operations described (the 1s and 2s) must be transactionally consistent.

2
  • 1
    If the OrderReceived message is what triggers the processing of the order, then it certainly would appear that you wouldn't want to store the Order (and confirm such to the orderer) without a guarantee that the OrderReceived message has also been generated. In turn, you wouldn't want the OrderReceived message to be confirmed as broadcast, without confirmation that it is received by, and safely under the control of, the relevant service that processes it. Commented Jan 4, 2022 at 10:01
  • 2
    Your real misconception is that there is a general answer to the question, outside of any domain or business context. You can always construct real-life cases where those operations must be transactionally consistent, and cases where eventual consistency is completely sufficient. Commented Jan 4, 2022 at 11:53

2 Answers

2

In general, transactional consistency requires a central authority, such as a database. You could then perform operations A, B, and C in a single database transaction and ensure that either all succeed or all fail.
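
As a minimal sketch of that all-or-nothing behaviour, assuming all three operations write to the same SQLite database, the following uses Python's standard `sqlite3` module. The `ops` table and the function names are hypothetical and only stand in for real work.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'ops' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ops (name TEXT NOT NULL)")
    conn.commit()
    return conn


def run_all_or_nothing(conn, fail_on_c=False):
    """Run operations A, B and C in one transaction.

    Returns True if all three were committed, False if everything
    was rolled back because one of them failed.
    """
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute("INSERT INTO ops (name) VALUES ('A')")
            conn.execute("INSERT INTO ops (name) VALUES ('B')")
            if fail_on_c:
                # Simulated failure of operation C (an assumption for the demo).
                raise RuntimeError("operation C failed")
            conn.execute("INSERT INTO ops (name) VALUES ('C')")
        return True
    except (RuntimeError, sqlite3.Error):
        return False


def count_ops(conn):
    """Number of operations that are durably recorded."""
    return conn.execute("SELECT COUNT(*) FROM ops").fetchone()[0]
```

When C fails, the rows for A and B disappear as well, which is exactly the guarantee that is lost once A, B and C live in separate services with separate databases.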

If this is not possible, for example because the services are distributed, you are forced to relax the consistency requirement. For example, if operation C fails you might have a special command to revert operations A and B. But these reverts can also fail, so you need to be able to deal with the situation where A and B have been done but C has failed.
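
A rough sketch of that idea, assuming each operation can be paired with a compensating action, might look like the following. The `run_saga` function and the shape of its result are hypothetical; the point is only that a failed compensation has to be reported rather than ignored.

```python
from typing import Callable, Dict, List, Tuple

# (step name, action, compensating action) -- all hypothetical placeholders.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> Dict[str, object]:
    """Run steps in order; on failure, compensate completed steps in reverse.

    Compensations can fail too, so the names of steps that could not be
    reverted are returned in 'unresolved' for follow-up.
    """
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            unresolved: List[str] = []
            # Undo the steps that already succeeded, newest first.
            for done_name, undo in reversed(completed):
                try:
                    undo()
                except Exception:
                    # The revert itself failed: leave it for intervention.
                    unresolved.append(done_name)
            return {"ok": False, "failed_step": name, "unresolved": unresolved}
        completed.append((name, compensate))
    return {"ok": True, "failed_step": None, "unresolved": []}
```

In a real distributed system the actions and compensations would be messages to other services rather than local function calls, so this only illustrates the control flow, not the delivery guarantees.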

One way to do this would be to store the C operation in a database, so that if service C is down, it can resume later and process all the operations it has missed. This presumes that it can actually perform all these operations; if it can fail for other reasons, you will need a way to deal with that as well.
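
A minimal sketch of such a store, assuming SQLite as the database and a hypothetical `pending_ops` table, could look like this. The function names and the `handler` callback are illustrative only.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the store of operations waiting for service C."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_ops ("
        "id INTEGER PRIMARY KEY, "
        "payload TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def record_operation(conn, payload):
    """Durably record an operation before anyone tries to perform it."""
    with conn:
        conn.execute("INSERT INTO pending_ops (payload) VALUES (?)", (payload,))


def process_pending(conn, handler):
    """Try every operation not yet marked done, oldest first.

    Operations whose handler raises stay pending and are returned,
    so a later run (for example after a restart) can pick them up again.
    """
    rows = conn.execute(
        "SELECT id, payload FROM pending_ops WHERE done = 0 ORDER BY id"
    ).fetchall()
    still_pending = []
    for op_id, payload in rows:
        try:
            handler(payload)
        except Exception:
            still_pending.append(payload)
            continue
        with conn:
            conn.execute("UPDATE pending_ops SET done = 1 WHERE id = ?", (op_id,))
    return still_pending
```

Note that if the process crashes after `handler` succeeds but before the row is marked done, the operation is performed again on the next run, so in this sketch the handler is assumed to be safe to repeat.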

For a real-life scenario, consider a webshop. When an order is received, all the information is validated, the order is stored in a database, and it is sent to the warehouse for dispatching. If the warehouse is shut down for some reason, you still want to accept the order so it can be sent for dispatching when the warehouse is online again. There is always a chance that the order cannot be fulfilled, for example if the stock database is incorrect, so you will need a process to handle such cases. These can be presumed to be fairly rare, so the process might be manual.

1

The general solution to this is to have a clean up check which runs periodically.

Service C:

  1. Perform operation C
  2. Close A (via message to A?)

Service D:

  1. Look for all open operation A's
  2. Check that the related operations B and C have completed successfully within the allowed time span
  3. If not, flag for intervention or take remedial action

This allows for retries and delays in the normal process.
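
The check performed by Service D can be sketched roughly as follows, assuming each operation A records when it started and when (if ever) it was closed. The `OperationA` record and `find_stale` function are hypothetical names, not part of any particular framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class OperationA:
    """Hypothetical record of an operation A and whether it was closed."""
    op_id: str
    started_at: datetime
    closed_at: Optional[datetime] = None  # set once C has closed A


def find_stale(
    operations: List[OperationA], now: datetime, allowed: timedelta
) -> List[str]:
    """Return ids of operations still open after the allowed time span."""
    return [
        op.op_id
        for op in operations
        if op.closed_at is None and now - op.started_at > allowed
    ]
```

Operations that are still open but within the allowed time span are left alone, which is what gives the normal process room for retries and delays.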

I would note that you would hope your normal process works 99.999999% of the time and leaves you with nice clean error logs.

Then say AWS goes down or the council dig through your internet connection or whatever, suddenly you get errors in your logs and your alerts start firing.

You read the messages, run out and splice in some new fibre and your services automatically pick up the errored MQ messages, retry and everything is back to normal again.

"Service D" is fulfilled by your run-of-the-mill error logging and alerting suite, plus you running out with a spanner when alerted. If you find yourself building a robot that runs out with a spanner, then your services A, B and C probably aren't as good as they should be.
