I have a query that is strangely slow in Postgres 13, even though the database contains only a small amount of data. I can even reproduce the problem in my test suite, where I fabricate fake data. Here is the query:
```sql
SELECT sales.*
FROM sales
INNER JOIN members ON members.id = sales.member_id
INNER JOIN members_teams ON members_teams.member_id = members.id
INNER JOIN teams ON teams.id = members_teams.team_id
WHERE teams.id IN (1, 2)
```

In my test suite I have the following row counts in the tables involved:
| Table         | Count |
| ------------- | ----- |
| members       | 501   |
| teams         | 3     |
| members_teams | 501   |
| sales         | 502   |

Here is an example of when it is slow:
```
Nested Loop  (cost=0.75..25.83 rows=1 width=631) (actual time=38226.620..38226.622 rows=0 loops=1)
  Join Filter: (members_teams.team_id = teams.id)
  ->  Nested Loop  (cost=0.75..24.82 rows=1 width=635) (actual time=0.082..38220.385 rows=502 loops=1)
        Join Filter: (members.id = members_teams.member_id)
        Rows Removed by Join Filter: 251000
        ->  Index Scan using index_members_teams_on_team_id on members_teams  (cost=0.25..8.26 rows=1 width=8) (actual time=0.031..0.544 rows=501 loops=1)
        ->  Nested Loop  (cost=0.50..16.54 rows=1 width=635) (actual time=0.014..76.217 rows=502 loops=501)
              Join Filter: (sales.member_id = members.id)
              Rows Removed by Join Filter: 125250
              ->  Index Scan using index_sales_on_member_id on sales  (cost=0.25..8.26 rows=1 width=631) (actual time=0.005..0.262 rows=502 loops=501)
              ->  Index Only Scan using members_pkey on members  (cost=0.25..8.26 rows=1 width=4) (actual time=0.008..0.124 rows=251 loops=251502)
                    Heap Fetches: 63001752
  ->  Seq Scan on teams  (cost=0.00..1.00 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=502)
        Filter: (id = ANY ('{1,2}'::integer[]))
        Rows Removed by Filter: 3
Planning Time: 0.690 ms
Execution Time: 38226.701 ms
```

Here is an example of when it runs at a more normal speed:
```
Nested Loop  (cost=0.75..24.82 rows=1 width=631) (actual time=224.746..224.747 rows=0 loops=1)
  Join Filter: (members.id = members_teams.member_id)
  ->  Nested Loop  (cost=0.50..16.54 rows=1 width=635) (actual time=0.047..80.953 rows=502 loops=1)
        Join Filter: (sales.member_id = members.id)
        Rows Removed by Join Filter: 125250
        ->  Index Scan using index_sales_on_member_id on sales  (cost=0.25..8.26 rows=1 width=631) (actual time=0.015..0.367 rows=502 loops=1)
        ->  Index Only Scan using members_pkey on members  (cost=0.25..8.26 rows=1 width=4) (actual time=0.009..0.131 rows=251 loops=502)
              Heap Fetches: 125752
  ->  Index Only Scan using index_members_teams_on_member_id_and_team_id on members_teams  (cost=0.25..8.27 rows=1 width=4) (actual time=0.286..0.286 rows=0 loops=502)
        Filter: (team_id = ANY ('{1,2}'::integer[]))
        Rows Removed by Filter: 501
        Heap Fetches: 251502
Planning Time: 0.481 ms
Execution Time: 224.798 ms
```

**Summary**
A key difference seems to be which index the planner uses for the join table members_teams. Do you have any suggestions for how I can make this query consistently performant? I thought about removing the join to teams and filtering on team_id on the join table instead (sketched below), but I'm worried that in the future we may need to extend this query with additional constraints from the teams table.
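For reference, this is the rewrite I had in mind. It is only a sketch: it assumes the query never needs anything from teams beyond the id filter, and I haven't verified that the planner actually chooses a better plan for it:

```sql
-- Possible alternative: filter the join table directly instead of
-- joining to teams. Only equivalent to the original query as long
-- as no other columns or constraints from teams are needed.
SELECT sales.*
FROM sales
INNER JOIN members ON members.id = sales.member_id
INNER JOIN members_teams ON members_teams.member_id = members.id
WHERE members_teams.team_id IN (1, 2);
```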