I started by ensuring the planner had updated stats:
```
my_db=> VACUUM ANALYZE;
VACUUM
Time: 1401.958 ms
```

When selecting only `foos.bar_id`, the query executes quickly with an Index Only Scan on that column:
```
my_db=> EXPLAIN ANALYZE SELECT foos.bar_id FROM foos INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.43..16203.46 rows=353198 width=4) (actual time=0.045..114.746 rows=196205 loops=1)
   ->  Seq Scan on bar_ids  (cost=0.00..16.71 rows=871 width=4) (actual time=0.005..0.195 rows=871 loops=1)
   ->  Index Only Scan using index_foos_on_bar_id on foos  (cost=0.43..14.80 rows=378 width=4) (actual time=0.003..0.055 rows=225 loops=871)
         Index Cond: (bar_id = bar_ids.id)
         Heap Fetches: 0
 Planning time: 0.209 ms
 Execution time: 144.364 ms
(7 rows)

Time: 145.620 ms
```

However, adding `foos.id` to the select list causes the planner to choose an extremely slow Seq Scan:
```
my_db=> EXPLAIN ANALYZE SELECT foos.id, foos.bar_id FROM foos INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=27.60..221339.63 rows=353198 width=8) (actual time=294.091..3341.926 rows=196205 loops=1)
   Hash Cond: (foos.bar_id = bar_ids.id)
   ->  Seq Scan on foos  (cost=0.00..182314.70 rows=7093070 width=8) (actual time=0.004..1855.900 rows=7111807 loops=1)
   ->  Hash  (cost=16.71..16.71 rows=871 width=4) (actual time=0.454..0.454 rows=866 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 39kB
         ->  Seq Scan on bar_ids  (cost=0.00..16.71 rows=871 width=4) (actual time=0.002..0.222 rows=871 loops=1)
 Planning time: 0.237 ms
 Execution time: 3371.622 ms
(8 rows)

Time: 3373.150 ms
```

Disabling Seq Scan forces an Index Scan on the same index, which is an order of magnitude faster than the Seq Scan:
```
my_db=> SET enable_seqscan = false;
SET
Time: 0.801 ms
my_db=> EXPLAIN ANALYZE SELECT foos.id, foos.bar_id FROM foos INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=10000000000.43..10000439554.99 rows=353198 width=8) (actual time=0.028..171.632 rows=196205 loops=1)
   ->  Seq Scan on bar_ids  (cost=10000000000.00..10000000016.71 rows=871 width=4) (actual time=0.005..0.212 rows=871 loops=1)
   ->  Index Scan using index_foos_on_bar_id on foos  (cost=0.43..500.86 rows=378 width=8) (actual time=0.003..0.118 rows=225 loops=871)
         Index Cond: (bar_id = bar_ids.id)
 Planning time: 0.186 ms
 Execution time: 201.958 ms
(6 rows)

Time: 203.185 ms
```

Other answers say this kind of poor planning is due to bad statistics, but my statistics are up to date. What gives?
`bar_ids` is a temporary table, which might be related to the enormous cost estimates in the last query (`Seq Scan on bar_ids  (cost=10000000000.00..10000000016.71 ...)`), but explicitly running `ANALYZE bar_ids` doesn't change the query plan.
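One thing worth noting about the temp-table angle: autovacuum never processes temporary tables, so a manual `ANALYZE` really is required for `bar_ids`. A quick way to confirm the planner actually has statistics for it (a sketch using the standard `pg_stats` view; table and column names are taken from the question):

```sql
-- Temp tables are invisible to autovacuum, so analyze manually:
ANALYZE bar_ids;

-- If this returns a row per column, the planner has statistics for the temp table;
-- if it returns nothing, ANALYZE didn't take effect in this session.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'bar_ids';
```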
A few thoughts:

- `VACUUM ANALYZE` should cover all tables, if I'm understanding the docs correctly.
- Have you tried `ALTER TABLE bar_ids ADD PRIMARY KEY (id);`?
- `random_page_cost` could likely stand to be reduced, though that's more art than science. I'd try 2 and see if that tilts the scale in your favor; given everything else, it may just be that the planner is reluctant to pick the index scan because of this. With good enough modern hardware, the default of 4 is a bit cautious.
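The suggestions above can be tried without touching anything globally, since a plain `SET` only lasts for the current session (a sketch, reusing the query from the question):

```sql
-- Give the planner a unique index on the join column:
ALTER TABLE bar_ids ADD PRIMARY KEY (id);
ANALYZE bar_ids;

-- Lower random_page_cost for this session only and re-check the plan:
SET random_page_cost = 2;
EXPLAIN ANALYZE
SELECT foos.id, foos.bar_id
FROM foos
INNER JOIN bar_ids ON foos.bar_id = bar_ids.id;

-- Undo the session-level change if the plan doesn't improve:
RESET random_page_cost;
```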