16

I'm using Postgres 9.3.5 and I have a large table in the database, currently it has over 25 million rows and it tends to get even larger rapidly. I'm trying to select specific rows (all unit_ids with only latest unit_timestamp for each of them) with a simple query like:

SELECT unit_id, max(unit_timestamp) AS latest_timestamp FROM all_units GROUP BY unit_id; 

Without any index, this query takes about 35 seconds to execute. With an index defined (CREATE INDEX partial_idx ON all_units (unit_id, unit_timestamp DESC);) the query time is shortened to around (only) 19 seconds.

I wonder if it will ever be possible to execute my query in even less time (like just few seconds) and if so, what steps should I take in order to optimize it even further?

My table structure dump looks like this:

CREATE TABLE "all_units" ( "unit_id" int4 NOT NULL, "unit_timestamp" timestamp(6) NOT NULL, "lon" float4, "lat" float4, "speed" float4, "status" varchar(255) COLLATE "default" ) ALTER TABLE "all_units" ADD PRIMARY KEY ("unit_id", "unit_timestamp"); 

The EXPLAIN (ANALYZE, BUFFERS) follows:

QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=663998.38..664069.73 rows=7135 width=12) (actual time=84715.050..84732.021 rows=11094 loops=1) Buffers: shared hit=192 read=286819 -> Seq Scan on ais_sorted (cost=0.00..538335.92 rows=25132492 width=12) (actual time=0.608..41264.196 rows=25132492 loops=1) Buffers: shared hit=192 read=286819 Total runtime: 84746.501 ms 

and my psql settings on server look like this:

 name | context | min_val | max_val | boot_val -------------------------------------+------------+-----------+--------------+----------------------------------------- allow_system_table_mods | postmaster | | | off application_name | user | | | archive_command | sighup | | | archive_mode | postmaster | | | off archive_timeout | sighup | 0 | 1073741823 | 0 array_nulls | user | | | on authentication_timeout | sighup | 1 | 600 | 60 autovacuum | sighup | | | on autovacuum_analyze_scale_factor | sighup | 0 | 100 | 0.1 autovacuum_analyze_threshold | sighup | 0 | 2147483647 | 50 autovacuum_freeze_max_age | postmaster | 100000000 | 2000000000 | 200000000 autovacuum_max_workers | postmaster | 1 | 8388607 | 3 autovacuum_multixact_freeze_max_age | postmaster | 10000000 | 2000000000 | 400000000 autovacuum_naptime | sighup | 1 | 2147483 | 60 autovacuum_vacuum_cost_delay | sighup | -1 | 100 | 20 autovacuum_vacuum_cost_limit | sighup | -1 | 10000 | -1 autovacuum_vacuum_scale_factor | sighup | 0 | 100 | 0.2 autovacuum_vacuum_threshold | sighup | 0 | 2147483647 | 50 backslash_quote | user | | | safe_encoding bgwriter_delay | sighup | 10 | 10000 | 200 bgwriter_lru_maxpages | sighup | 0 | 1000 | 100 bgwriter_lru_multiplier | sighup | 0 | 10 | 2 block_size | internal | 8192 | 8192 | 8192 bonjour | postmaster | | | off bonjour_name | postmaster | | | bytea_output | user | | | hex check_function_bodies | user | | | on checkpoint_completion_target | sighup | 0 | 1 | 0.5 checkpoint_segments | sighup | 1 | 2147483647 | 3 checkpoint_timeout | sighup | 30 | 3600 | 300 checkpoint_warning | sighup | 0 | 2147483647 | 30 client_encoding | user | | | SQL_ASCII client_min_messages | user | | | notice commit_delay | superuser | 0 | 100000 | 0 commit_siblings | user | 0 | 1000 | 5 config_file | postmaster | | | constraint_exclusion | user | | | partition cpu_index_tuple_cost | user | 0 | 1.79769e+308 | 0.005 cpu_operator_cost | user | 0 | 1.79769e+308 | 0.0025 cpu_tuple_cost | user | 0 | 1.79769e+308 | 0.01 cursor_tuple_fraction | user | 0 | 1 | 0.1 data_checksums | internal | | | off data_directory | postmaster | | | DateStyle | user | | | ISO, MDY db_user_namespace | sighup | | | off deadlock_timeout | superuser | 1 | 2147483647 | 1000 debug_assertions | user | | | off debug_pretty_print | user | | | on debug_print_parse | user | | | off debug_print_plan | user | | | off debug_print_rewritten | user | | | off default_statistics_target | user | 1 | 10000 | 100 default_tablespace | user | | | default_text_search_config | user | | | pg_catalog.simple default_transaction_deferrable | user | | | off default_transaction_isolation | user | | | read committed default_transaction_read_only | user | | | off default_with_oids | user | | | off dynamic_library_path | superuser | | | $libdir effective_cache_size | user | 1 | 2147483647 | 16384 effective_io_concurrency | user | 0 | 1000 | 1 enable_bitmapscan | user | | | on enable_hashagg | user | | | on enable_hashjoin | user | | | on enable_indexonlyscan | user | | | on enable_indexscan | user | | | on enable_material | user | | | on enable_mergejoin | user | | | on enable_nestloop | user | | | on enable_seqscan | user | | | on enable_sort | user | | | on enable_tidscan | user | | | on escape_string_warning | user | | | on event_source | postmaster | | | PostgreSQL exit_on_error | user | | | off external_pid_file | postmaster | | | extra_float_digits | user | -15 | 3 | 0 from_collapse_limit | user | 1 | 2147483647 | 8 fsync | sighup | | | on full_page_writes | sighup | | | on geqo | user | | | on geqo_effort | user | 1 | 10 | 5 geqo_generations | user | 0 | 2147483647 | 0 geqo_pool_size | user | 0 | 2147483647 | 0 geqo_seed | user | 0 | 1 | 0 geqo_selection_bias | user | 1.5 | 2 | 2 geqo_threshold | user | 2 | 2147483647 | 12 gin_fuzzy_search_limit | user | 0 | 2147483647 | 0 hba_file | postmaster | | | hot_standby | postmaster | | | off hot_standby_feedback | sighup | | | off ident_file | postmaster | | | ignore_checksum_failure | superuser | | | off ignore_system_indexes | backend | | | off integer_datetimes | internal | | | on IntervalStyle | user | | | postgres join_collapse_limit | user | 1 | 2147483647 | 8 krb_caseins_users | sighup | | | off krb_server_keyfile | sighup | | | FILE:/etc/postgresql-common/krb5.keytab krb_srvname | sighup | | | postgres lc_collate | internal | | | C lc_ctype | internal | | | C lc_messages | superuser | | | lc_monetary | user | | | C lc_numeric | user | | | C lc_time | user | | | C listen_addresses | postmaster | | | localhost lo_compat_privileges | superuser | | | off local_preload_libraries | backend | | | lock_timeout | user | 0 | 2147483647 | 0 log_autovacuum_min_duration | sighup | -1 | 2147483647 | -1 log_checkpoints | sighup | | | off log_connections | backend | | | off log_destination | sighup | | | stderr log_directory | sighup | | | pg_log log_disconnections | backend | | | off log_duration | superuser | | | off log_error_verbosity | superuser | | | default log_executor_stats | superuser | | | off log_file_mode | sighup | 0 | 511 | 384 log_filename | sighup | | | postgresql-%Y-%m-%d_%H%M%S.log log_hostname | sighup | | | off log_line_prefix | sighup | | | log_lock_waits | superuser | | | off log_min_duration_statement | superuser | -1 | 2147483647 | -1 log_min_error_statement | superuser | | | error log_min_messages | superuser | | | warning log_parser_stats | superuser | | | off log_planner_stats | superuser | | | off log_rotation_age | sighup | 0 | 35791394 | 1440 log_rotation_size | sighup | 0 | 2097151 | 10240 log_statement | superuser | | | none log_statement_stats | superuser | | | off log_temp_files | superuser | -1 | 2147483647 | -1 log_timezone | sighup | | | GMT log_truncate_on_rotation | sighup | | | off logging_collector | postmaster | | | off maintenance_work_mem | user | 1024 | 2147483647 | 16384 max_connections | postmaster | 1 | 8388607 | 100 max_files_per_process | postmaster | 25 | 2147483647 | 1000 max_function_args | internal | 100 | 100 | 100 max_identifier_length | internal | 63 | 63 | 63 max_index_keys | internal | 32 | 32 | 32 max_locks_per_transaction | postmaster | 10 | 2147483647 | 64 max_pred_locks_per_transaction | postmaster | 10 | 2147483647 | 64 max_prepared_transactions | postmaster | 0 | 8388607 | 0 max_stack_depth | superuser | 100 | 2147483647 | 100 max_standby_archive_delay | sighup | -1 | 2147483647 | 30000 max_standby_streaming_delay | sighup | -1 | 2147483647 | 30000 max_wal_senders | postmaster | 0 | 8388607 | 0 password_encryption | user | | | on port | postmaster | 1 | 65535 | 5432 post_auth_delay | backend | 0 | 2147 | 0 pre_auth_delay | sighup | 0 | 60 | 0 quote_all_identifiers | user | | | off random_page_cost | user | 0 | 1.79769e+308 | 4 restart_after_crash | sighup | | | on search_path | user | | | "$user",public segment_size | internal | 131072 | 131072 | 131072 seq_page_cost | user | 0 | 1.79769e+308 | 1 server_encoding | internal | | | SQL_ASCII server_version | internal | | | 9.3.5 server_version_num | internal | 90305 | 90305 | 90305 session_replication_role | superuser | | | origin shared_buffers | postmaster | 16 | 1073741823 | 1024 shared_preload_libraries | postmaster | | | sql_inheritance | user | | | on ssl | postmaster | | | off ssl_ca_file | postmaster | | | ssl_cert_file | postmaster | | | server.crt ssl_ciphers | postmaster | | | DEFAULT:!LOW:!EXP:!MD5:@STRENGTH ssl_crl_file | postmaster | | | ssl_key_file | postmaster | | | server.key ssl_renegotiation_limit | user | 0 | 2147483647 | 524288 standard_conforming_strings | user | | | on statement_timeout | user | 0 | 2147483647 | 0 stats_temp_directory | sighup | | | pg_stat_tmp superuser_reserved_connections | postmaster | 0 | 8388607 | 3 synchronize_seqscans | user | | | on synchronous_commit | user | | | on synchronous_standby_names | sighup | | | syslog_facility | sighup | | | local0 syslog_ident | sighup | | | postgres tcp_keepalives_count | user | 0 | 2147483647 | 0 tcp_keepalives_idle | user | 0 | 2147483647 | 0 tcp_keepalives_interval | user | 0 | 2147483647 | 0 temp_buffers | user | 100 | 1073741823 | 1024 temp_file_limit | superuser | -1 | 2147483647 | -1 temp_tablespaces | user | | | TimeZone | user | | | GMT timezone_abbreviations | user | | | trace_notify | user | | | off trace_recovery_messages | sighup | | | log trace_sort | user | | | off track_activities | superuser | | | on track_activity_query_size | postmaster | 100 | 102400 | 1024 track_counts | superuser | | | on track_functions | superuser | | | none track_io_timing | superuser | | | off transaction_deferrable | user | | | off transaction_isolation | user | | | default transaction_read_only | user | | | off transform_null_equals | user | | | off unix_socket_directories | postmaster | | | /var/run/postgresql unix_socket_group | postmaster | | | unix_socket_permissions | postmaster | 0 | 511 | 511 update_process_title | superuser | | | on vacuum_cost_delay | user | 0 | 100 | 0 vacuum_cost_limit | user | 1 | 10000 | 200 vacuum_cost_page_dirty | user | 0 | 10000 | 20 vacuum_cost_page_hit | user | 0 | 10000 | 1 vacuum_cost_page_miss | user | 0 | 10000 | 10 vacuum_defer_cleanup_age | sighup | 0 | 1000000 | 0 vacuum_freeze_min_age | user | 0 | 1000000000 | 50000000 vacuum_freeze_table_age | user | 0 | 2000000000 | 150000000 vacuum_multixact_freeze_min_age | user | 0 | 1000000000 | 5000000 vacuum_multixact_freeze_table_age | user | 0 | 2000000000 | 150000000 wal_block_size | internal | 8192 | 8192 | 8192 wal_buffers | postmaster | -1 | 2147483647 | -1 wal_keep_segments | sighup | 0 | 2147483647 | 0 wal_level | postmaster | | | minimal wal_receiver_status_interval | sighup | 0 | 2147483 | 10 wal_receiver_timeout | sighup | 0 | 2147483647 | 60000 wal_segment_size | internal | 2048 | 2048 | 2048 wal_sender_timeout | sighup | 0 | 2147483647 | 60000 wal_sync_method | sighup | | | fdatasync wal_writer_delay | sighup | 1 | 10000 | 200 work_mem | user | 64 | 2147483647 | 1024 xmlbinary | user | | | base64 xmloption | user | | | content zero_damaged_pages | superuser | | | off 
10
  • Do you really need ALL unit_ids?Because a WHERE clause would help. Commented Sep 8, 2014 at 9:52
  • Unfortunately I do need ALL of them, and to make it even worse, right now I get "only" 11000+ units in return, but in the future I believe there will be 5-10 times more of them. Commented Sep 8, 2014 at 9:54
  • Try SELECT DISTINCT ON (unit_id),unit_timestamp FROM t ORDER BY unit_timestamp DESC A separate index on unit_timestamp would help. Commented Sep 8, 2014 at 10:10
  • is materialized view an option? Commented Sep 8, 2014 at 11:00
  • Hm, didn't know about materialized views at all. I will try to create one and see if that would be useful! Commented Sep 8, 2014 at 11:17

1 Answer 1

14

Query

Your query is forced to scan the whole table (or the whole index). Every row could be another distinct unit. The only way to substantially shorten the process would be a separate table with all available units - which would help as long as there are substantially fewer units than entries in all_units.
Since you have ~ 11k units (added in comment) for 25M entries, this should definitely help.

Depending on frequencies of values, there are a couple of query techniques to get your result considerably faster:

  • recursive CTE
  • JOIN LATERAL
  • correlated subquery

Details in this related answer on SO:

Only needing the implicit index of the primary key on (unit_id, unit_timestamp), this query should do the trick, using an implicit JOIN LATERAL:

SELECT u.unit_id, a.max_ts FROM unit u , (SELECT unit_timestamp AS max_ts FROM all_units WHERE unit_id = u.unit_id ORDER BY unit_timestamp DESC LIMIT 1 ) a; 

Excludes units without entry in all_units, like your original query.
Or a lowly correlated subquery (probably even faster):

SELECT u.unit_id , (SELECT unit_timestamp FROM all_units WHERE unit_id = u.unit_id ORDER BY unit_timestamp DESC LIMIT 1) AS max_ts FROM unit u; 

Includes units without entry in all_units.

Efficiency depends on the number of entries per unit. The more entries, the more potential for one of these queries.

In a quick local test with similar tables (500 "units", 1M rows in big table), the query with correlated subqueries was ~ 500x faster than your original. Index-only scans on the PK index of the big table vs. sequential scan in your original query.

Since your table tends to get even larger rapidly, a materialized view is probably not an option.

There is also DISTINCT ON as another possible query technique, but it's hardly going to be faster than your original query, so not the answer you are looking for. Details here:

Index

Your partial_idx:

CREATE INDEX partial_idx ON all_units (unit_id, unit_timestamp DESC);

is not in fact a partial index and also redundant. Postgres can scan indexes backwards at practically the same speed, the PK serves well. Drop this additional index.

Table layout

A couple of points for your table definition.

CREATE TABLE all_units ( unit_timestamp timestamp, unit_id int4, lon float4, lat float4, speed float4, status varchar(255), -- might be improved. PRIMARY KEY (unit_id, unit_timestamp) ); 
  • timestamp(6) doesn't make much sense, it's effectively the same as just timestamp, which already saves a maximum of 6 fractional digits.

  • I switched positions of the first two columns to save 4 bytes of padding, which amounts to ~ 100 MB for 25M rows (exact result depends on status). Smaller tables are typically faster for everything.

  • If status isn't free text, but some kind of standardized note, you could replace it with something a lot cheaper. More about varchar(255) in Postgres.

Server configuration

You need to configure your server. Most of your settings seem to be conservative defaults. 1 MB on shared_buffers or work_mem seems way to low for an installation with millions of rows. And random_pare_cost = 4 is to high for any modern system with plenty of RAM. Start with the manual and the Postgres Wiki:

11
  • For completeness, you should add the DISTINCT ON mthod in the list of query techniques for this problem (of course if OP wants to show all rows as he claims, and not just the 2 columns as the code shows) Commented Sep 8, 2014 at 14:00
  • @ypercube: For completeness, yes. But it's hardly going to be faster. Not the answer the OP is looking for. Commented Sep 8, 2014 at 14:02
  • Could you elaborate how a recursive CTE is going to be faster (or even: how it would solve the problem in the first place - I have never thought about recursive CTEs for a "max" problem) Commented Sep 8, 2014 at 15:04
  • 1
    @Mihai: no, it only has a bitmap index scan (that is it generates the bitmap index "on the fly") Commented Sep 8, 2014 at 17:00
  • 3
    @ErwinBrandstetter ALRIGHT! First of all thank you very very much for this elaborated answer, it was extremely helpful, I cannot explain you how much! :) Second, your query (the second one) is AWESOME! The query time is now around 0.2 seconds!!! :) I tried to play around with the first query also but I didn't manage to resolve the error I got: invalid reference to FROM-clause entry for table "u" in this WHERE clause in inner query. Sorry, but I'm not so good with databases myself so I didn't oppress my head with it since the second query helped already a lot. Commented Sep 9, 2014 at 9:54

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.