Is there a way to use both an index for join and a separate index for a different where clause? Whenever I add the join query- it causes the where index to be ignored.
I have Users table (80,000 records)
| id | company_id |
|---|---|
| 1 | 4 |
| 2 | 4 |
| 3 | 5 |
with btree index on company_id: PRIMARY KEY, btree (id) Index btree on company_id
Comments table (7,000,000 records)
| id | user_id |
|---|---|
| 11 | 2 |
| 22 | 2 |
| 33 | 3 |
with btree index on user_id:
PRIMARY KEY, btree (id)Index btree on user_id
If I use a simple where - it uses the company index:
SELECT users.id FROM users WHERE users.company_id IN (16111, 16198, 16113, 16133, 16135, 16134, 16136, 16137, 16138, 16139, 16141, 16142, 16143, 16144, 16145, 16146, 16147, 16333) > Index Scan using index_users_on_company_id on users (cost=0.29..213.88 rows=97 width=4) Index Cond: (company_id = ANY ('{16111,16198,16113,16133,16135,16134,16136,16137,16138,16139,16141,16142,16143,16144,16145,16146,16147,16333}'::integer[])) But when I add, although it uses user_id index at comments, it stopped using the company_id index (Parallel Seq Scan instead of previously used Index Scan):
SELECT users.id FROM users INNER JOIN comments ON comments.user_id = users.id WHERE users.company_id IN (16111, 16198, 16113, 16133, 16135, 16134, 16136, 16137, 16138, 16139, 16141, 16142, 16143, 16144, 16145, 16146, 16147, 16333) Gather (cost=1000.43..9632.88 rows=1410 width=4) Workers Planned: 1 -> Nested Loop (cost=0.43..8491.88 rows=829 width=4) -> Parallel Seq Scan on users (cost=0.00..2742.06 rows=57 width=4) Filter: (company_id = ANY ('{16111,16198,16113,16133,16135,16134,16136,16137,16138,16139,16141,16142,16143,16144,16145,16146,16147,16333}'::integer[])) -> Index Only Scan using index_comments_on_user_id on comments (cost=0.43..98.21 rows=266 width=4) Index Cond: (user_id = users.id) To summarize:
Is there a way to use both the company_id index and the user_id index in a single query?
Basically substitute the Parallel Seq Scan on users with Index Scan using index_users_on_company_id on users at the second query?
UPDATE:
After setting max_parallel_workers_per_gather to 0, as @Anton Grig suggested, The same query, uses index instead of Parallel Seq Scan It does performs faster. I wonder if there's a less intrusive way..
Nested Loop (cost=0.72..9998.67 rows=1410 width=4) -> Index Scan using index_users_on_company_id on users (cost=0.29..213.88 rows=97 width=4) Index Cond: (company_id = ANY ('{16111,16198,16113,16133,16135,16134,16136,16137,16138,16139,16141,16142,16143,16144,16145,16146,16147,16333}'::integer[])) -> Index Only Scan using index_comments_on_user_id on comments (cost=0.43..98.21 rows=266 width=4) Index Cond: (user_id = users.id) UPDATE2:
I'll add the full query (uses left outer join) I'm running, not a minified version for question
SELECT COUNT(DISTINCT comments.tag_id) FILTER (WHERE comments.archived = FALSE ) AS tags_count FROM users LEFT OUTER JOIN comments ON comments.user_id = users.id WHERE users.company_id IN (16111, 16198, 16113, 16133, 16135, 16134, 16136, 16137, 16138, 16139, 16141, 16142, 16143, 16144, 16145, 16146, 16147, 16333) GROUP BY users.id
max_parallel_workers_per_gatherto 0