Alternatives to Pessimistic Locking in Cluster Applications

Question

I am researching alternatives to database-level pessimistic locking to achieve transaction isolation in a cluster of Java applications going against the same database. Synchronizing concurrent access in the application tier is clearly not a solution in the present configuration because the same database transaction can be invoked from multiple JVMs concurrently. Currently, we are subject to occasional race conditions which, due to the optimistic locking we have in place via Hibernate, cause a StaleObjectStateException exception and data loss.

I have a moderately large transaction within the scope of my refactoring project. Let's describe it as updating one top-level table row and then making various related inserts and/or updates to several of its child entities, each being one-to-many. I would like to insure exclusive access to the top-level table row and all of the children to be affected but I would like to stay away from pessimistic locking at the database level for performance reasons mostly.

Does it make sense to start a single (perhaps synchronous) message queue application into which this method could be moved to insure synchronized access as opposed to each cluster node using its own, which is a clear race condition hazard? I am mentioning this approach even though I am not confident in it because both the top-level table row and its children could also be updated from other system calls, not just the mentioned transaction. So I am seeking to design a solution where the top-level table row and its children will all somehow be pseudo-locked (exclusive transaction isolation) but at the application and not the database level.

I am open to ideas and suggestions, I understand this is not a very cut and dried challenge.

What did you end up doing? I have a similar situation in my project. — Downhillski
– Downhillski, Commented Jul 11, 2018 at 1:34

gnasher729 · Accepted Answer · 2025-02-10 09:42:23Z

You should not have any data loss.

The sequence is: Optimistic lock, several operations forming one transaction, unlock, a message that the transaction wasnt finished. Which means none of your operations is in the database.

At that point you should still know the purpose of each operation, figure out if these operations are still possible, and start all over again. Repeat until the transaction goes through or your checks show it cannot be performed (for example if you have six tyres two people can order four tyres, one transaction fails, and then you have only two tyres and nobody can buy four).

Assuming the transaction started with some user interface you may want to re-display the user interface.

This this a third time this month I see a first correct answer appear on a decade old question. Frustration. — Basilevs
– Basilevs, Commented Feb 10 at 9:52

Kain0_0 · Accepted Answer · 2025-02-05 08:33:06Z

This is a scheduling problem.

What each of your nodes wants to do is exclusively reserve an object without permanently locking the object, nor preventing it from participating in reads while some data-manipulation is being prepared behind the scenes.

The solution is a queue, but we don't have to off-shore the data-manipulation.

Make two tables. One is going to be the queue table and the other is going to be an ownership table.

The queue table is going to contain a task id, any task details, a node id, and the id of the object graph (top rows id?).

The ownership table is going to be a task id, the object graph id, a timestamp for expected task end.

There will need to be a constraint that a task is unique in the ownership table.

each node is going to write its own task into the task table. Depending on how the node architects it on that thread, or in a seperate thread pool we are going to poll for a task. The polling is going to start a transaction in tsql:

it select uncommitted from the ownership table

this picks up even in process transaction locks reducing contention
you'll have to ignore records whose timestamp is too old which removes perma locking from a node failure.

it selects with skipping locked pages from the task table so that we see all tasks without contention

the ownership select is used to exclude owned tasks, or tasks that share a object graph with one.
we will also filter out any task that can't be run on this node

we pick one of the tasks

we check the ownership table and delete any stale record for it
we insert a fresh record into the ownership table claiming the task and object graph.

Even though the transaction technically locks data in the database by doing it this way the only contestable lock is adding a new row to the ownership table that might be rejected if another poller is trying to pick up the same task (by a constraint between the tables), or by deleting a stale record (which should be uncontested as its very stale).

When we added the ownership record we set a timestamp that operates as a sunset on our right to modify. We should make sure that we don't take longer than this to process (or periodically update it when processing - like a heart beat).

At this point we own the task, and by extension the right to modify the data (for an amount of time). We process and update the graph. Once done we mark the task as complete (or delete it from the table) and delete the ownership (or update it to a new task).

If the task fails due to an optimistic lock issue, or something else it is fine. The task record still exists, it will become available once the ownership expires for reprocessing. So will the object graph we have locked will become available for updates. You might want to add some logging/optics to this so you can address repetitive failures, or if an object is receiving way more action than expected.

adramazany · Accepted Answer · 2016-02-26 08:18:55Z

I think you have some solutions for this with strengthnessesand weaknesses

using MQ (Message Queue) service with ability to account or ticket isolation
moving the update sql command to last ( last of possible location) (for example : doing update command after inserting child rows)
using transaction savepoints for shrink pessimestic lock time
update with version control

good luck

Stack Exchange Network

Alternatives to Pessimistic Locking in Cluster Applications

3 Answers 3

This is a scheduling problem.

Hot Network Questions

Alternatives to Pessimistic Locking in Cluster Applications

3 Answers 3

This is a scheduling problem.

Related

Hot Network Questions