I am attempting to run an Postgres UPSERT in fairly concurrent code. I have been seeing a duplicate key value violates unique constraint error when my service is under load.
By way of a reduced example, I have the following table, which has an id primary key and an id_secondary unique key (meant to function as a second ID)
CREATE TABLE "test_table" ( "id" text NOT NULL, "id_secondary" text NOT NULL, "value" bool NOT NULL DEFAULT 'FALSE', PRIMARY KEY ("id") ); CREATE UNIQUE INDEX "test_table_uc" ON "public"."test_table" USING BTREE ("id_secondary"); I am running the following two statements simultaneously:
INSERT INTO "test_table" ("id", "id_secondary", "value") VALUES('1', '1', 'false') ON CONFLICT ("id") DO UPDATE SET "value" = "excluded"."value"; INSERT INTO "test_table" ("id", "id_secondary", "value") VALUES('1', '1', 'true') ON CONFLICT ("id") DO UPDATE SET "value" = "excluded"."value"; and it is my expectation that whichever statement executes last will be the one to set the value of "value".
Occasionally I am receiving the error duplicate key value violates unique constraint "test_table_uc" from one of these statements. My full test setup in Go (using testify) is:
suite.Run("Concurrency", func() { db, err := sql.Open("postgres", "host=localhost port=35432 dbname=test sslmode=disable user=postgres") db.SetMaxOpenConns(10) suite.Require().NoError(err) for ii := 0; ii < 100; ii++ { wg := sync.WaitGroup{} wg.Add(2) go func() { _, err := db.Exec(` INSERT INTO "test_table" ("id", "id_secondary", "value") VALUES($1, $2, 'false') ON CONFLICT ("id") DO UPDATE SET "value" = "excluded"."value" `, ii, ii) suite.Require().NoError(err) wg.Done() }() go func() { _, err := db.Exec(` INSERT INTO "test_table" ("id", "id_secondary", "value") VALUES($1, $2, 'true') ON CONFLICT ("id") DO UPDATE SET "value" = "excluded"."value" `, ii, ii) suite.Require().NoError(err) wg.Done() }() wg.Wait() } }) Where I am allowing concurrent SQL connections using db.SetMaxOpenConns(10) and the go func() syntax is running both SQL statements in different goroutines (threads). Only two of the statements are run at a time, and the value of "id" and "id_secondary" are always set to the loop iteration. I wait for both statements to return before iterating.
It generally fails with the duplicate key value violates unique constraint error within a few loops.
I have not modified my isolation settings and I am running in SHOW default_transaction_isolation = read committed, on PostgreSQL 9.6.24
My Question
At no point is the statement generating a row that does not satisfy the unique constraints of the table. Seemingly, both statements are attempting to insert data at the same time. The Postgres documentation states that:
ON CONFLICT DO UPDATE guarantees an atomic INSERT or UPDATE outcome; provided there is no independent error, one of those two outcomes is guaranteed, even under high concurrency. This is also known as UPSERT — “UPDATE or INSERT”. which I interpreting to mean that the INSERT or UPDATE happens atomically, they should never attempt to write in data with the same primary key.
For the test_table_uc constraint to be violated, then in my example, the primary key uniqueness constraint must also be breaking. This is very surprising to me.
What am I missing here?