Design question for exactly-once processing in a message-driven system using a unique ID
To achieve exactly-once processing when messages are consumed from a queue with at-least-once delivery, many sources suggest attaching a unique ID to each message in the producer, which consumers can then use to deduplicate messages.

I'm curious about how this works in practice. Consider a message consumer responsible for issuing refunds, where it's critical that each refund is processed exactly once. If we use a database to manage deduplication, the consumer implementation might resemble the following pseudocode:

  1. Receive refund message
  2. Insert DB record with key message.unique_id
  3. If insertion fails due to unique constraint:
    1. Ignore this message and exit
  4. Call payment gateway API to issue refund
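The steps above can be sketched in Python. This is a minimal illustration, not a real implementation: an in-memory sqlite3 table stands in for the deduplication store, and `issue_refund` plus the `refunds` list stand in for the payment gateway, so all names here are hypothetical.

```python
import sqlite3

# In-memory table standing in for the deduplication store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
conn.commit()

refunds = []  # stands in for the payment gateway's external state

def issue_refund(message_id):
    # Hypothetical stand-in for the payment gateway API call.
    refunds.append(message_id)

def handle_refund_message(message):
    try:
        # Step 2: insert a record keyed by the message's unique ID.
        with conn:  # commits on success, rolls back on exception
            conn.execute("INSERT INTO processed (message_id) VALUES (?)",
                         (message["unique_id"],))
    except sqlite3.IntegrityError:
        # Step 3: unique-constraint violation means this ID was seen before.
        return  # step 3.1: ignore the duplicate and exit
    # Step 4: if the consumer crashes before reaching this line, the
    # refund is silently lost -- the failure mode described below.
    issue_refund(message["unique_id"])

handle_refund_message({"unique_id": "r-1"})
handle_refund_message({"unique_id": "r-1"})  # redelivery: deduplicated
```

Running this, the second (redelivered) message is ignored, so only one refund goes out in the happy path.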

A problem arises if the consumer crashes between steps 3 and 4: the refund is never issued, because on redelivery the existing DB record causes the message to be ignored. This degrades to at-most-once processing.

An alternative could involve a transaction to ensure the DB insert rolls back if the refund fails:

  1. Receive refund message
  2. Start DB write transaction
  3. Insert DB record with key message.unique_id
  4. If insertion fails due to unique constraint:
    1. Ignore this message and exit
  5. Call payment gateway API to issue refund
  6. Commit transaction
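The transactional variant can be sketched the same way. Again this is illustrative only: the dedup insert and commit use sqlite3's connection-as-context-manager, the crash between steps 5 and 6 is simulated with an exception, and `issue_refund` is a hypothetical stand-in for the gateway call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
conn.commit()

refunds = []  # stands in for the payment gateway's external state

def issue_refund(message_id, crash_before_commit=False):
    refunds.append(message_id)  # step 5: the external side effect happens here
    if crash_before_commit:
        raise RuntimeError("consumer crashed between steps 5 and 6")

def handle_refund_message(message, crash_before_commit=False):
    try:
        # Steps 2-6: the `with` block is the transaction; it commits on a
        # clean exit and rolls back if any exception escapes.
        with conn:
            conn.execute("INSERT INTO processed (message_id) VALUES (?)",
                         (message["unique_id"],))
            issue_refund(message["unique_id"], crash_before_commit)
    except sqlite3.IntegrityError:
        return  # step 4.1: duplicate ID, ignore and exit
    except RuntimeError:
        # The INSERT was rolled back, but the refund already went out.
        pass

handle_refund_message({"unique_id": "r-2"}, crash_before_commit=True)
handle_refund_message({"unique_id": "r-2"})  # redelivery: refund issued twice
```

Because the crash rolls back the dedup record but cannot un-call the gateway, the redelivered message passes the uniqueness check and the refund is issued a second time.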

But now if the consumer crashes between steps 5 and 6, the DB insert is rolled back even though the refund was already issued, so redelivery causes a duplicate refund. We're back to at-least-once processing.

How is it possible to achieve exactly-once processing in this scenario?
