0

Let's say I have a table with a primary key id and I added a nullable int called worker_id. I have a large amount of records so I run workers in parallel. First I run

SELECT id FROM foo WHERE worker_id IS NULL ... LIMIT 1000 

Then I retrieve these in my program and do

UPDATE foo WHERE id IN (1,2,3,4,...) AND worker_id IS NULL 

I am hoping by first selecting the primary keys it only needs to lock the specific records and nothing else. However, I am not sure whether this is enough to avoid any potential deadlocks. Is it? If no, how do I do avoid them?

1 Answer 1

0

No, the SELECT statement you show does not lock any rows. It's a non-locking read, because it does not include a locking clause (FOR UPDATE or FOR SHARE).

Even if you do use a locking read, you can still get a deadlock, because concurrent transactions may request locks in an interleaved fashion, and end up waiting for each other.

Read https://dev.mysql.com/doc/refman/en/innodb-locking-reads.html for more details on locking reads.

Basically, deadlocks may happen if concurrent sessions are doing updates against the same table.

You have these options to prevent deadlocks:

  • Do not allow updates from concurrent sessions. Do updates only from one session at a time.

  • Use pessimistic table locks or user locks, to ensure only one session at a time can run updates. But this tends to create a bottleneck if sessions are queued up waiting for the single lock.

Other than that, just write your client code to catch deadlock exceptions and retry.


Re your comment:

I can't tell from your queries what is the table definition or the EXPLAIN strategy for the UPDATE. It could lock non-atomically especially if the table has multiple unique keys. Or it could lock without the use of an index, so it has to fall back to locking every row in the table. Or the index it chooses could vary based on the optimizer's decision each time you run the UPDATE.

The bottom line is that it's hard to prevent deadlocks if you run concurrent sessions. You should always code defensively. That is, you should assume deadlocks will happen, and write code to handle them with retries.

3
  • I know this SELECT doesn't lock, I was curious about UPDATE. Commented Oct 10 at 2:09
  • @chx - One uses FOR UPDATE on the SELECT, thereby making the UPDATE statement "safe". Magic. (This requires that both the SELECT and UPDATE are in the same transaction (BEGIN...COMMIT). Commented Oct 10 at 21:17
  • @RickJames, Even then, one can get deadlocks, depending on the order of locks acquired in different sessions. Commented Oct 10 at 22:29

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.