I have a query that is strangely slow in Postgres 13 for a database containing only small amounts of data. I have even seen the problem in my test suite where I fabricate some fake data.

```sql
SELECT sales.*
FROM sales
INNER JOIN members ON members.id = sales.member_id
INNER JOIN members_teams ON members_teams.member_id = members.id
INNER JOIN teams ON teams.id = members_teams.team_id
WHERE teams.id IN (1, 2)
```

In my test suite I have the following counts of data in the different tables:

| Table         | Count |
| ------------- | ----- |
| members       | 501   |
| teams         | 3     |
| members_teams | 501   |
| sales         | 502   |

Here is an example of when it is slow:

```
Nested Loop  (cost=0.75..25.83 rows=1 width=631) (actual time=38226.620..38226.622 rows=0 loops=1)
  Join Filter: (members_teams.team_id = teams.id)
  ->  Nested Loop  (cost=0.75..24.82 rows=1 width=635) (actual time=0.082..38220.385 rows=502 loops=1)
        Join Filter: (members.id = members_teams.member_id)
        Rows Removed by Join Filter: 251000
        ->  Index Scan using index_members_teams_on_team_id on members_teams  (cost=0.25..8.26 rows=1 width=8) (actual time=0.031..0.544 rows=501 loops=1)
        ->  Nested Loop  (cost=0.50..16.54 rows=1 width=635) (actual time=0.014..76.217 rows=502 loops=501)
              Join Filter: (sales.member_id = members.id)
              Rows Removed by Join Filter: 125250
              ->  Index Scan using index_sales_on_member_id on sales  (cost=0.25..8.26 rows=1 width=631) (actual time=0.005..0.262 rows=502 loops=501)
              ->  Index Only Scan using members_pkey on members  (cost=0.25..8.26 rows=1 width=4) (actual time=0.008..0.124 rows=251 loops=251502)
                    Heap Fetches: 63001752
  ->  Seq Scan on teams  (cost=0.00..1.00 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=502)
        Filter: (id = ANY ('{1,2}'::integer[]))
        Rows Removed by Filter: 3
Planning Time: 0.690 ms
Execution Time: 38226.701 ms
```
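For reference, plans like this can be captured by prefixing the query with `EXPLAIN (ANALYZE, BUFFERS)`; the `BUFFERS` option is not required, but it makes heap access visible. A sketch using the query from this question:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT sales.*
FROM sales
INNER JOIN members ON members.id = sales.member_id
INNER JOIN members_teams ON members_teams.member_id = members.id
INNER JOIN teams ON teams.id = members_teams.team_id
WHERE teams.id IN (1, 2);
```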

Here is an example of when it is a more normal speed:

```
Nested Loop  (cost=0.75..24.82 rows=1 width=631) (actual time=224.746..224.747 rows=0 loops=1)
  Join Filter: (members.id = members_teams.member_id)
  ->  Nested Loop  (cost=0.50..16.54 rows=1 width=635) (actual time=0.047..80.953 rows=502 loops=1)
        Join Filter: (sales.member_id = members.id)
        Rows Removed by Join Filter: 125250
        ->  Index Scan using index_sales_on_member_id on sales  (cost=0.25..8.26 rows=1 width=631) (actual time=0.015..0.367 rows=502 loops=1)
        ->  Index Only Scan using members_pkey on members  (cost=0.25..8.26 rows=1 width=4) (actual time=0.009..0.131 rows=251 loops=502)
              Heap Fetches: 125752
  ->  Index Only Scan using index_members_teams_on_member_id_and_team_id on members_teams  (cost=0.25..8.27 rows=1 width=4) (actual time=0.286..0.286 rows=0 loops=502)
        Filter: (team_id = ANY ('{1,2}'::integer[]))
        Rows Removed by Filter: 501
        Heap Fetches: 251502
Planning Time: 0.481 ms
Execution Time: 224.798 ms
```

Summary

A key difference seems to be which index is used for the join table `members_teams`. Do you have any suggestions for how I can make this query consistently performant? I thought about removing the join to `teams` and filtering on `team_id` on the join table instead, but I'm worried that in the future we may need to use this query with additional constraints on the `teams` table.
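For completeness, the rewrite mentioned above would look like the following sketch; it returns the same rows as the original query as long as no columns or filters from `teams` are actually needed:

```sql
SELECT sales.*
FROM sales
INNER JOIN members ON members.id = sales.member_id
INNER JOIN members_teams ON members_teams.member_id = members.id
WHERE members_teams.team_id IN (1, 2);
```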

1 Answer


Your estimates seem completely off: the planner expects a single row from each scan (`rows=1`), while the actual row counts are around 500. Do you have autovacuum disabled, or is your statistics collector malfunctioning? You should get better plans after explicitly collecting statistics:

```sql
ANALYZE sales;
ANALYZE members;
ANALYZE members_teams;
```
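To check whether autovacuum has been analyzing these tables at all, the `pg_stat_user_tables` view records when each table was last analyzed, manually or automatically. A sketch:

```sql
SELECT relname, last_analyze, last_autoanalyze, n_live_tup
FROM pg_stat_user_tables
WHERE relname IN ('sales', 'members', 'members_teams');
```

If both `last_analyze` and `last_autoanalyze` are NULL, the planner has never had real statistics for these tables, which would explain the `rows=1` estimates.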