
I currently have a problem with index usage in PostgreSQL on two different hosts (local & remote) for the same query. The query in question is the following:

SELECT COUNT(*)
FROM (
    SELECT 1 AS one
    FROM "big_table"
    WHERE "big_table"."user_id" = 13
      AND "big_table"."action" = 1
      AND (big_table.created_at >= '2018-12-09 23:00:00'::timestamp without time zone)
    ORDER BY big_table.created_at DESC
    LIMIT 15 OFFSET 10
) subquery_for_count;

Changing this query is out of scope for me, since it's generated by a library we are using, so I'd like to find a solution that doesn't require changing it. If I run EXPLAIN on this query locally, my Postgres instance outputs the following:

local_host=# EXPLAIN SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 13 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00'::timestamp without time zone) ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=8.59..8.60 rows=1 width=8)
   ->  Limit  (cost=8.57..8.58 rows=1 width=12)
         ->  Sort  (cost=8.57..8.57 rows=1 width=12)
               Sort Key: big_table.created_at DESC
               ->  Index Scan using big_table_idx_user_action_transfers on big_table  (cost=0.56..8.56 rows=1 width=12)
                     Index Cond: ((user_id = 13) AND (action = 1))
                     Filter: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone)
(7 rows)

This is fine; it (partially) uses the compound index on user_id and action, as expected. However, if I run the query on the remote system, I get the following EXPLAIN output:

remote_host=# EXPLAIN SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 13 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00'::timestamp without time zone) ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=8472.67..8472.68 rows=1 width=8)
   ->  Limit  (cost=3389.25..8472.48 rows=15 width=12)
         ->  Index Scan Backward using index_big_table_on_created_at on big_table  (cost=0.44..4492554.51 rows=13257 width=12)
               Index Cond: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone)
               Filter: ((user_id = 13) AND (action = 1))
(5 rows)

As one can see, on the remote host the database uses the index on created_at rather than the compound index on user_id and action that my local installation uses. This makes the query really slow on the remote host (> 1 min to complete), because a lot of entries fulfill the index condition, and filtering all of them takes a long time. On my local installation, however, it's quite quick (~1 s to complete). Both my local and the remote tables have the same number of entries (~25 million) and about the same data distribution. We run the vacuum daemon on the remote host, so VACUUM ANALYZE is done quite frequently. Also, the indices are set up exactly the same on both systems.

I have already searched for solutions to this problem, but so far I haven't found anything useful apart from running VACUUM ANALYZE and making sure indices exist for the related attributes.

Does anyone have a clue? Of course I could add a compound index over all the attributes used (user_id, action and created_at), but I'm still quite confused as to why the "correct" index is not used on the remote host in this case.
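For reference, the compound index I'm considering would look something like this (the index name is just a placeholder):

CREATE INDEX big_table_idx_user_action_created_at
    ON big_table (user_id, action, created_at);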

Both hosts run Postgres 9.6 (9.6.9 on the local host and 9.6.17 on the remote host, to be exact).

Edit #1: Here's the query plan with EXPLAIN (ANALYZE, BUFFERS) on the remote (production) host:

remote_host=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 874 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00') ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count;
                                                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=8539.91..8539.92 rows=1 width=8) (actual time=206301.833..206301.833 rows=1 loops=1)
   Buffers: shared hit=16082155 read=550889 dirtied=3
   ->  Limit  (cost=3416.15..8539.73 rows=15 width=12) (actual time=206301.826..206301.826 rows=0 loops=1)
         Buffers: shared hit=16082155 read=550889 dirtied=3
         ->  Index Scan Backward using index_big_table_on_created_at on big_table  (cost=0.44..4515235.12 rows=13219 width=12) (actual time=67472.247..206301.821 rows=2 loops=1)
               Index Cond: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone)
               Filter: ((user_id = 874) AND (action = 1))
               Rows Removed by Filter: 26362926
               Buffers: shared hit=16082155 read=550889 dirtied=3
 Planning time: 2.003 ms
 Execution time: 206302.505 ms
(11 rows)

Edit #2: And here's the query plan with EXPLAIN (ANALYZE, BUFFERS) on the local host:

local_host=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 874 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00') ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count;
                                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6.37..6.38 rows=1 width=8) (actual time=0.046..0.046 rows=1 loops=1)
   Buffers: shared hit=4
   ->  Limit  (cost=6.35..6.35 rows=1 width=12) (actual time=0.043..0.043 rows=0 loops=1)
         Buffers: shared hit=4
         ->  Sort  (cost=6.34..6.35 rows=1 width=12) (actual time=0.041..0.041 rows=0 loops=1)
               Sort Key: big_table.created_at DESC
               Sort Method: quicksort  Memory: 25kB
               Buffers: shared hit=4
               ->  Index Scan using big_table_idx_user_action_transfers on big_table  (cost=0.56..6.33 rows=1 width=12) (actual time=0.022..0.022 rows=0 loops=1)
                     Index Cond: ((user_id = 874) AND (action = 1))
                     Filter: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone)
                     Buffers: shared hit=4
 Planning time: 0.299 ms
 Execution time: 0.104 ms
(14 rows)
3 Comments
  • What's the data type of the created_at column? There might be a datatype narrowing/widening operation going on that prohibits the use of the index. Try eliminating the cast to timestamp without timezone for the sake of understanding what's going on. Will that change the plan? Commented Apr 21, 2020 at 12:13
  • @a_horse_with_no_name, thanks for the tip, I'll update the question accordingly. Commented Apr 21, 2020 at 12:22
  • @blubb Unfortunately I can't change the query as mentioned above. The created_at column is of type timestamp without time zone. Removing the cast operation does not change anything in the EXPLAIN output. Commented Apr 21, 2020 at 12:24

1 Answer


The planner can use one index to provide the filter and then do the sort, or it can use the other index to provide the ORDER BY and then stop early based on the LIMIT. It has to choose; it can't do both. PostgreSQL has no way of knowing that everything with "big_table"."user_id" = 13 AND "big_table"."action" = 1 was also created a long time ago, so it doesn't know that stopping early based on the LIMIT will not actually stop very early.
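For context, judging from the two plans, the competing indexes are presumably defined roughly like this (the names come from the EXPLAIN output; the column lists are inferred and may not be exact):

CREATE INDEX big_table_idx_user_action_transfers ON big_table (user_id, action);
CREATE INDEX index_big_table_on_created_at ON big_table (created_at);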

It is hard to figure out what your question is. You seem to already know the answer: build the index on (user_id, action, created_at). So do that, if you want the performance problem solved.

You say you are confused, and also that you can't change the query. Whether you can change the query coming from your app is irrelevant to resolving your confusion. Even if unfortunate tooling limitations prevent you from implementing a solution, that doesn't prevent you from understanding either the solution or the problem.

Are you seeking understanding, or a solution?

Both my local and the remote tables have the same number of entries (~25 million) and about the same data distribution

There are many dimensions to data distribution. Maybe the servers are similar in some of them, but not in others. Seeing the output of EXPLAIN (ANALYZE, BUFFERS) from both servers could really help, but likely would not be enough. It would also be nice to see EXPLAIN (ANALYZE, BUFFERS) from the slow server when it is using the fast plan. You can get that by dropping the wrong index, or by changing the query so it uses ORDER BY (big_table.created_at + interval '0') desc. You don't need to get your app to run this query; you can run it manually.
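Concretely, you would run something like this manually on the slow server (the + interval '0' leaves the result unchanged but hides the sort order from the planner, so it can no longer use index_big_table_on_created_at to satisfy the ORDER BY):

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM (
    SELECT 1 AS one FROM "big_table"
    WHERE "big_table"."user_id" = 13
      AND "big_table"."action" = 1
      AND big_table.created_at >= '2018-12-09 23:00:00'
    ORDER BY (big_table.created_at + interval '0') DESC
    LIMIT 15 OFFSET 10
) subquery_for_count;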

On second thought, seeing EXPLAIN (ANALYZE, BUFFERS) for the fast server running the slow plan might be even more useful. You can probably do that by changing the query to use ...WHERE ("big_table"."user_id" + 0 = 13) AND...
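In full, that variant would look like this (the + 0 keeps the comparison semantics but prevents the planner from matching the (user_id, action) index, pushing it toward the created_at index):

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM (
    SELECT 1 AS one FROM "big_table"
    WHERE ("big_table"."user_id" + 0 = 13)
      AND "big_table"."action" = 1
      AND big_table.created_at >= '2018-12-09 23:00:00'
    ORDER BY big_table.created_at DESC
    LIMIT 15 OFFSET 10
) subquery_for_count;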


3 Comments

"has no way of knowing that everything with ... were also created a long time ago" - I wonder if this can be improved by creating extended statistics on those columns.
@a_horse_with_no_name I have not had luck with that. I think it just uses those for selectivity, and not for how selectivity interacts with ORDER BY. Maybe someday.
Thanks for your remarks, I just added the output of the query plans with those flags enabled. And to your question: I'm seeking a solution in the end, but I'd like to understand what the problem was, especially because if I don't understand the problem, I can't be sure the additional index actually fixes it.
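For reference, the extended statistics suggested above would be created roughly like this. Note that CREATE STATISTICS requires PostgreSQL 10 or later, so it is not available on the 9.6 hosts in this question; the statistics name is illustrative:

-- Requires PostgreSQL 10+. Tells the planner about functional
-- dependencies between the two columns, improving selectivity
-- estimates for queries filtering on both.
CREATE STATISTICS big_table_user_action_stats (dependencies)
    ON user_id, action FROM big_table;
ANALYZE big_table;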
