I have a huge table (around 10 million rows). For simplicity, let's say it has only 2 columns, user_id and activity_id, like this:
user_id | activity_id
---------------------
      1 | 1
      1 | 2
      1 | 3
      2 | 1
      2 | 2

I want to select all user_id with activity_id = 1, 2 NOT 3. In the case above it will be just one result: user_id = 2.

I can do it using SELECT DISTINCT combined with the INTERSECT and EXCEPT operators, but it seems to be extremely slow.
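A minimal sketch of what such an INTERSECT / EXCEPT query might look like on the sample data above. The table name `user_activity` is hypothetical (the post only names the two columns), and the condition is read as "has both 1 and 2, but not 3", which is an assumption. Python's built-in sqlite3 is used only to make the snippet self-contained; the SQL itself is standard, and the actual query in question may differ.

```python
import sqlite3

# Hypothetical table name; only the two column names come from the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_activity (user_id INTEGER, activity_id INTEGER)")
conn.executemany(
    "INSERT INTO user_activity VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)],  # sample rows from the post
)

# Assumption: "1, 2 NOT 3" means users having activity 1 AND activity 2,
# minus users having activity 3. The operators are evaluated as
# (A INTERSECT B) EXCEPT C.
QUERY = """
SELECT DISTINCT user_id FROM user_activity WHERE activity_id = 1
INTERSECT
SELECT DISTINCT user_id FROM user_activity WHERE activity_id = 2
EXCEPT
SELECT DISTINCT user_id FROM user_activity WHERE activity_id = 3
"""

result = [row[0] for row in conn.execute(QUERY)]
print(result)  # [2]
```

Note that INTERSECT and EXCEPT (without ALL) already remove duplicates, so the DISTINCT in each branch does not change the result.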
From what I know about databases, this could be improved with a GIN index and table partitioning. However, I feel that is not the right solution in the case of PostgreSQL, because the set operators are slow on their own.
activity_id=4 changes nothing in such a case. It's 10.000.000 rows.

activity id = 1, 2: do you mean 1 AND 2, or 1 OR 2?