I have a problem currently with index usage of Postgresql on two different hosts (local & remote) for the same query. The query in question is the following:
SELECT COUNT(*) FROM ( SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 13 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00'::timestamp without time zone) ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10 ) subquery_for_count; It's out of scope for me to change this query, since it's generated by a library we are using, so I'd like to find a solution without having to change it. If I run the EXPLAIN command locally with said query my Postgres instance outputs the following:
local_host=# EXPLAIN SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 13 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00'::timestamp without time zone) ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=8.59..8.60 rows=1 width=8) -> Limit (cost=8.57..8.58 rows=1 width=12) -> Sort (cost=8.57..8.57 rows=1 width=12) Sort Key: big_table.created_at DESC -> Index Scan using big_table_idx_user_action_transfers on big_table (cost=0.56..8.56 rows=1 width=12) Index Cond: ((user_id = 13) AND (action = 1)) Filter: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone) (7 rows) This is fine, it (partially) uses the compound index on user_id and action as expected. However, if I run the query on the remote system I get the following EXPLAIN output:
remote_host=# EXPLAIN SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 13 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00'::timestamp without time zone) ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=8472.67..8472.68 rows=1 width=8) -> Limit (cost=3389.25..8472.48 rows=15 width=12) -> Index Scan Backward using index_big_table_on_created_at on big_table (cost=0.44..4492554.51 rows=13257 width=12) Index Cond: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone) Filter: ((user_id = 13) AND (action = 1)) (5 rows) As one can see, on the remote host the database uses the index on created_at, rather than user_id and action as my local installation. This leads to this query being reeeeeeeally slow on the remote host (> 1min to complete), because there are a lot of entries which fulfill the index condition, and filtering all of those takes a lot of time. But, on my local installation it's quite quick (~1s to complete). Both my local and the remote tables have the same number of entries (~25mio.) and about the same data distribution. We run the Vacuum daemon on the remote host, so VACUUM ANALYZE is done quite frequently. Also, the indices are setup exactly the same on both systems.
I have already tried to search for solutions for this problem, but I haven't found anything useful until now apart from running VACUUM ANALYZE and making sure indices for the related attributes exist.
Maybe one of you guys has a clue? Of course I could add a compound index for all used attributes (user_id, action and created_at), but I'm still heavily confused as to why the "correct" index is not used in this case on the remote host.
Both hosts use the version 9.6 of Postgres (9.6.9 on the local host and 9.6.17 on the remote host to be exact).
Edit #1: Here's the output of the query plan with (ANALYZE, BUFFERS) set on the production host:
remote_host=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 874 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00') ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=8539.91..8539.92 rows=1 width=8) (actual time=206301.833..206301.833 rows=1 loops=1) Buffers: shared hit=16082155 read=550889 dirtied=3 -> Limit (cost=3416.15..8539.73 rows=15 width=12) (actual time=206301.826..206301.826 rows=0 loops=1) Buffers: shared hit=16082155 read=550889 dirtied=3 -> Index Scan Backward using index_big_table_on_created_at on big_table (cost=0.44..4515235.12 rows=13219 width=12) (actual time=67472.247..206301.821 rows=2 loops=1) Index Cond: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone) Filter: ((user_id = 874) AND (action = 1)) Rows Removed by Filter: 26362926 Buffers: shared hit=16082155 read=550889 dirtied=3 Planning time: 2.003 ms Execution time: 206302.505 ms (11 rows) Edit #2: Here's the output of the query plan with (ANALYZE, BUFFERS) set on the local host:
local_host=# EXPLAIN (ANALYZE, BUFFERS) SELECT COUNT(*) FROM (SELECT 1 AS one FROM "big_table" WHERE "big_table"."user_id" = 874 AND "big_table"."action" = 1 AND (big_table.created_at >= '2018-12-09 23:00:00') ORDER BY big_table.created_at desc LIMIT 15 OFFSET 10) subquery_for_count; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=6.37..6.38 rows=1 width=8) (actual time=0.046..0.046 rows=1 loops=1) Buffers: shared hit=4 -> Limit (cost=6.35..6.35 rows=1 width=12) (actual time=0.043..0.043 rows=0 loops=1) Buffers: shared hit=4 -> Sort (cost=6.34..6.35 rows=1 width=12) (actual time=0.041..0.041 rows=0 loops=1) Sort Key: big_table.created_at DESC Sort Method: quicksort Memory: 25kB Buffers: shared hit=4 -> Index Scan using big_table_idx_user_action_transfers on big_table (cost=0.56..6.33 rows=1 width=12) (actual time=0.022..0.022 rows=0 loops=1) Index Cond: ((user_id = 874) AND (action = 1)) Filter: (created_at >= '2018-12-09 23:00:00'::timestamp without time zone) Buffers: shared hit=4 Planning time: 0.299 ms Execution time: 0.104 ms (14 rows)
created_atcolumn is of typetimestamp without time zone. Removing the cast operation does not change anything in theEXPLAINoutput.