
I'm trying to make a table to store the edges of a directed graph, but I want to avoid duplicate edges in both directions. I can of course make it one direction:

CREATE TABLE edge (parent integer, child integer, UNIQUE(parent, child)); 

But how can I make sure that I also don't allow a new (parent, child) that already exists as (child, parent)?

Any ideas would be greatly appreciated!

2 Answers


Using GREATEST and LEAST

Essentially, if we sort the columns in the index, we can ensure that a duplicate entry in either direction causes a collision and a constraint violation, because sort(5,2) = sort(2,5). Since SQL has no two-column sort for this, we use GREATEST and LEAST instead.

CREATE TABLE edge ( parent int, child int );
CREATE UNIQUE INDEX ON edge ( greatest(parent,child), least(parent,child) );

INSERT INTO edge(parent,child) VALUES (42,7), (42,9), (5,7);
INSERT INTO edge(parent,child) VALUES (7,42);

ERROR:  duplicate key value violates unique constraint "edge_greatest_least_idx"
DETAIL:  Key ((GREATEST(parent, child)), (LEAST(parent, child)))=(42, 7) already exists.
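As a side note (my own sketch, not part of the original answer): a lookup that ignores edge direction can reuse the same two expressions, which also lets the planner use the expression index:

-- Sketch: check whether an edge exists in either direction.
-- Matching the indexed expressions lets PostgreSQL use edge_greatest_least_idx.
SELECT EXISTS (
  SELECT 1
  FROM edge
  WHERE greatest(parent, child) = greatest(42, 7)
    AND least(parent, child) = least(42, 7)
);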
  • Thanks! The greatest / least pair was the thing I was apparently too stupid for. Commented Oct 4, 2017 at 17:53
  • @EvanCarroll - check out my answer! What's your assessment of performance issues? Commented Oct 6, 2017 at 2:59

You can also do this (for positive INTEGERs; db-fiddle):

CREATE TABLE edge ( parent int NOT NULL, child int NOT NULL );
CREATE UNIQUE INDEX ed_uq_ix ON edge (ABS(parent - child), (parent * child));

Non-conflicting records:

INSERT INTO edge(parent,child) VALUES (42,7), (42,9), (5,7); -- succeeds 

BUT:

INSERT INTO edge VALUES (9, 42); -- fails! 
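To make the mechanism concrete (a worked example of my own, not from the answer): both orderings of a pair map to the same index key, and for positive integers the key (|p - c|, p * c) determines the unordered pair {p, c}, since (p + c)^2 = (p - c)^2 + 4pc.

-- Both orderings of (9, 42) produce the same index key:
SELECT ABS(42 - 9) AS diff, 42 * 9 AS product;  -- 33, 378
SELECT ABS(9 - 42) AS diff, 9 * 42 AS product;  -- 33, 378 -> duplicate key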
  • I'm not sure it's faster -- and I'm not sure the speed matters for this kind of operation, but it won't work on more than 100k parent/child rows without growing to bigint. For instance, run your create DDL and then run INSERT INTO edge(parent,child) VALUES (1e5, 1e5); My example works up until 2,147,483,647 (>1e9), which is the full range of ints. Commented Oct 6, 2017 at 3:23
  • Won't PostgreSQL just CAST anyway? I'm not sure any system would cope with 2^32 edges? Commented Oct 6, 2017 at 3:28
  • 1
    I think you're missing what I'm saying. In a normal serial column, you can handle 2^31-1 rows and the primary key can handle 2^32 rows. Both of those are usually far more rows than needed. In your index, you can only handle 46340 rows. Why? Because your algorithm is p*c assuming p=c, you can exhaust that range in 46,341. I find it possible to have more than 46,341 edges. So the difference between your range and my range is (2^31-1) - sqrt(2^31-1) you have to value the performance difference, if anything, that much; if you can only support 46,341 rows - does it matter? Commented Oct 6, 2017 at 4:34
  • Besides the other reasons - already mentioned by Evan and a_horse_with_no_name - this has an issue with negative ids as well. It doesn't allow both (-5,1) and (-1,5) for example. Commented Oct 6, 2017 at 7:34
  • @ypercubeᵀᴹ edited to deal with your concerns! Hadn't thought of that! Commented Oct 6, 2017 at 10:14
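Following up on the overflow discussed in the comments above: one possible workaround (a sketch of my own, not from the answer, using a hypothetical index name) is to widen the expressions to bigint, giving the product headroom up to 2^63 - 1:

-- Sketch: cast to bigint inside the expressions so parent * child
-- cannot overflow the 32-bit int range for large ids.
-- (Drop the original ed_uq_ix first, or use a new name as here.)
CREATE UNIQUE INDEX ed_uq_bigint_ix ON edge (ABS(parent::bigint - child), ((parent::bigint) * child));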
