Return rows that are max of one column in Postgresql

Question

Sample data from my table test_table:

date        symbol   value  created_time
2010-01-09  symbol1  101    3847474847
2010-01-10  symbol1  102    3847474847
2010-01-10  symbol1  102.5  3847475500
2010-01-10  symbol2  204    3847474847
2010-01-11  symbol1  109    3847474847
2010-01-12  symbol1  105    3847474847
2010-01-12  symbol2  206    3847474847
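To reproduce the sample data, something like the following should work. The column types are assumptions, since the question does not state them.

```sql
-- Column types are guesses; adjust them to match the real table.
CREATE TABLE test_table (
    date         date    NOT NULL,
    symbol       text    NOT NULL,
    value        numeric NOT NULL,
    created_time bigint  NOT NULL
);

-- The seven sample rows from the question.
INSERT INTO test_table (date, symbol, value, created_time) VALUES
    ('2010-01-09', 'symbol1', 101,   3847474847),
    ('2010-01-10', 'symbol1', 102,   3847474847),
    ('2010-01-10', 'symbol1', 102.5, 3847475500),
    ('2010-01-10', 'symbol2', 204,   3847474847),
    ('2010-01-11', 'symbol1', 109,   3847474847),
    ('2010-01-12', 'symbol1', 105,   3847474847),
    ('2010-01-12', 'symbol2', 206,   3847474847);
```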

Given the table above, I am trying to find the optimal index to put on the table (date, symbol, value and created_time combined should be unique) and the query to go along with it, so that it returns the following:

date        symbol   value  created_time
2010-01-09  symbol1  101    3847474847
2010-01-10  symbol1  102.5  3847475500
2010-01-10  symbol2  204    3847474847
2010-01-11  symbol1  109    3847474847
2010-01-12  symbol1  105    3847474847
2010-01-12  symbol2  206    3847474847

I am looking for the date, symbol, value columns of data for each group of those three with the maximum created_time (essentially rows 1, 3, 4, 5, 6, 7 in the example above).

Currently I have tried this index...

CREATE UNIQUE INDEX "test_table_date_symbol_value_created_time" ON "test_table" USING btree (date, symbol, value, created_time)

And I am using this query. I am not sure it is the most efficient way; it still seems pretty slow.

    select *
    from (
        select date, symbol, value, created_time,
               max(created_time) over (partition by date, symbol) as max_created_time
        from "test_table"
    ) t
    where symbol in ('symbol1', 'symbol2')
      and created_time = max_created_time

So you want the latest row for every unique date, symbol and value combination?
– Bohemian ♦, Commented Jan 25, 2017 at 18:17
Yes exactly, sorry. I am pretty inexperienced with SQL, so I have been trying to read a lot and take an educated guess before asking.
– Trevor Nederlof, Commented Jan 25, 2017 at 18:25
Sorry, I meant to say the latest row for each date, symbol combination.
– Trevor Nederlof, Commented Jan 25, 2017 at 19:11

Bohemian · Accepted Answer · 2017-01-25 20:05:45Z

Postgres supports window functions that suit this situation:

    select date, symbol, value, created_time
    from (
        select *,
               rank() over (partition by date, symbol order by created_time desc) as rownum
        from test_table
    ) x
    where rownum = 1

For every combination of date and symbol, this query returns the value and created_time from the row with the highest (i.e. latest) created_time for that date and symbol.
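One thing to be aware of: rank() gives the same rank to rows that tie on created_time, so if two rows for one date and symbol share the maximum created_time, both come back. If exactly one row per group is wanted, a variant with row_number() might look like this. The tie-break on value is an arbitrary assumption, not something the question specifies.

```sql
-- row_number() numbers rows 1, 2, 3, ... within each partition, so
-- filtering on rn = 1 yields at most one row per (date, symbol).
-- The secondary sort on value is an assumed, arbitrary tie-break.
SELECT date, symbol, value, created_time
FROM (
    SELECT *,
           row_number() OVER (
               PARTITION BY date, symbol
               ORDER BY created_time DESC, value DESC
           ) AS rn
    FROM test_table
) x
WHERE rn = 1;
```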

I would suggest this index:

CREATE UNIQUE INDEX test_table_idx ON test_table (date, symbol, created_time, value)

It's a covering index: it contains every column the query needs, so Postgres can potentially answer the query from the index alone without reading the table (your original index was covering too). The difference is that created_time comes before value, so the index entries are already sorted in the order the partition needs. value goes last because it plays no part in deciding which row to return.

Interesting, thank you for the query. How could I modify this to give only the latest row for each unique date, symbol combination (so only the latest value would be given)? I apologize, I answered your comment above incorrectly.
Thank you for the explanations, they make a lot of sense, and the query seems faster than my previous attempts, at least in my testing so far. Moving created_time before value did seem to make it slower, though. I don't know that much about indexes, so perhaps it's faster in the long run with some more testing (I have about 2400 symbols and about 11 million rows right now).
@Trevor try executing analyze test_table, then retry the queries.
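As a rough sketch of that suggestion: refresh the planner statistics, then look at the actual plan to see which index is used. Whether the planner picks an index-only scan depends on the data and table state, so treat the output as something to inspect rather than a guaranteed result.

```sql
-- Refresh the planner's statistics for the table.
ANALYZE test_table;

-- Run the query and print the real plan with timings and buffer usage.
-- Compare the plans and timings obtained with each candidate index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT date, symbol, value, created_time
FROM (
    SELECT *,
           rank() OVER (PARTITION BY date, symbol
                        ORDER BY created_time DESC) AS rownum
    FROM test_table
) x
WHERE rownum = 1;
```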

Clodoaldo Neto · Accepted Answer · 2017-01-25 19:43:47Z

PostgreSQL has the very handy DISTINCT ON:

    select distinct on (symbol, date) *
    from test_table
    order by symbol, date, created_time desc

https://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT
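DISTINCT ON keeps the first row of each (symbol, date) group in ORDER BY order, so sorting by created_time descending makes that first row the latest one. The DISTINCT ON expressions have to match the leftmost ORDER BY expressions. A sketch that also applies the symbol filter from the question, with an index whose column order matches the sort (the index name is made up for illustration):

```sql
-- Hypothetical index name; the column order mirrors the ORDER BY below.
CREATE INDEX test_table_symbol_date_created_idx
    ON test_table (symbol, date, created_time DESC);

-- Latest row per (symbol, date), restricted to the symbols of interest.
SELECT DISTINCT ON (symbol, date)
       date, symbol, value, created_time
FROM test_table
WHERE symbol IN ('symbol1', 'symbol2')
ORDER BY symbol, date, created_time DESC;
```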

wildplasser · Accepted Answer · 2017-01-25 19:40:30Z

Alternative:

    SELECT *
    FROM test_table tt
    WHERE NOT EXISTS (
        SELECT *
        FROM test_table nx
        WHERE nx."date" = tt."date"
          AND nx.symbol = tt.symbol
          AND nx.created_time > tt.created_time
    );
