PostgreSQL - GROUP subsequent rows

Question

I have a table which contains some records ordered by date.

I want to get the start and end dates for each group of consecutive rows (grouped by some criterion, e.g. position).

    create table tbl (id int, date timestamp without time zone, position int);

    insert into tbl values
        (1, '2013-12-01', 1),
        (2, '2013-12-02', 2),
        (3, '2013-12-03', 2),
        (4, '2013-12-04', 2),
        (5, '2013-12-05', 3),
        (6, '2013-12-06', 3),
        (7, '2013-12-07', 2),
        (8, '2013-12-08', 2);

Of course, if I simply group by position I get the wrong result, because the same position can occur in different groups:

    SELECT POSITION, min(date) MIN, max(date) MAX
    FROM tbl
    GROUP BY POSITION

I will get:

    POSITION  MIN                              MAX
    1         December, 01 2013 00:00:00+0000  December, 01 2013 00:00:00+0000
    3         December, 05 2013 00:00:00+0000  December, 06 2013 00:00:00+0000
    2         December, 02 2013 00:00:00+0000  December, 08 2013 00:00:00+0000

But I want:

    POSITION  MIN                              MAX
    1         December, 01 2013 00:00:00+0000  December, 01 2013 00:00:00+0000
    2         December, 02 2013 00:00:00+0000  December, 04 2013 00:00:00+0000
    3         December, 05 2013 00:00:00+0000  December, 06 2013 00:00:00+0000
    2         December, 07 2013 00:00:00+0000  December, 08 2013 00:00:00+0000

I found a solution for MySQL that uses variables, and I could port it, but I believe PostgreSQL can do this in a smarter way using its advanced features, such as window functions.

I'm using PostgreSQL 9.2.

Jerzy Pawlikowski · Accepted Answer · 2013-12-07 21:25:07Z

There is probably a more elegant solution, but try this:

    WITH tmp_tbl AS (
        SELECT *,
               CASE WHEN lag(position, 1) OVER (ORDER BY id) = position
                    THEN position
                    ELSE ROW_NUMBER() OVER (ORDER BY id)
               END AS grouping_col
        FROM tbl
    ),
    tmp_tbl2 AS (
        SELECT position, date,
               CASE WHEN lag(position, 1) OVER (ORDER BY id) = position
                    THEN lag(grouping_col, 1) OVER (ORDER BY id)
                    ELSE ROW_NUMBER() OVER (ORDER BY id)
               END AS grouping_col
        FROM tmp_tbl
    )
    SELECT POSITION, min(date) MIN, max(date) MAX
    FROM tmp_tbl2
    GROUP BY grouping_col, position
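A possibly more robust variant, offered as a sketch rather than a tested replacement. The query above returns the desired rows for the sample data, but a run of three or more equal positions seems to stay in one group only when the position value happens to equal the row number at which the run starts (as it does for position 2 starting at id 2). The sketch below instead flags every row whose position differs from the previous row's, then uses a running sum of those flags as the group number. The names flagged, numbered, is_start, grp, range_start and range_end are made up for illustration, and the query assumes the date values are unique.

```sql
-- Sketch only: assumes tbl as defined in the question and unique date values.
WITH flagged AS (
    SELECT position, date,
           -- 1 when this row starts a new run, 0 when it continues the previous one.
           -- For the first row lag() is NULL, so the comparison is NULL and ELSE applies.
           CASE WHEN lag(position) OVER (ORDER BY date) = position
                THEN 0 ELSE 1
           END AS is_start
    FROM tbl
),
numbered AS (
    SELECT position, date,
           -- Running count of run starts: every row in the same run gets the same number.
           sum(is_start) OVER (ORDER BY date) AS grp
    FROM flagged
)
SELECT position,
       min(date) AS range_start,
       max(date) AS range_end
FROM numbered
GROUP BY grp, position
ORDER BY range_start;
```

This uses only lag(), sum() as a window function and common table expressions, all of which are available in PostgreSQL 9.2.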

David Aldridge · Accepted Answer · 2013-12-07 21:14:21Z

There are some complete answers on Stack Overflow for this, so I won't repeat them in detail, but the principle is to group the records according to the difference between:

The row number when ordered by the date (via a window function)
The difference between the dates and a static date of reference.

So you have a series such as:

    rownum  datediff  diff
    1       1         0     ^
    2       2         0     | first group
    3       3         0     v
    4       5         1     ^
    5       6         1     | second group
    6       7         1     v
    7       9         2     ^
    8       10        2     v third group
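Applied to the table from the question, the idea might look like the sketch below. It goes beyond what the answer states in a few ways, all of them assumptions: the row number is taken per position (PARTITION BY position), the reference date 2013-01-01 is arbitrary, and the names grp, s, range_start and range_end are illustrative. It also assumes one row per calendar day, so a missing day would split a group even if the position did not change.

```sql
-- Sketch only: assumes tbl as defined in the question, one row per calendar day.
SELECT position,
       min(date) AS range_start,
       max(date) AS range_end
FROM (
    SELECT position, date,
           -- Days since an arbitrary reference date, minus the row number within
           -- the same position. The result is constant while the dates are
           -- consecutive and changes whenever there is a gap.
           (date::date - DATE '2013-01-01')
             - row_number() OVER (PARTITION BY position ORDER BY date) AS grp
    FROM tbl
) AS s
GROUP BY position, grp
ORDER BY range_start;
```

For the sample rows, position 2 gets one grp value for 2013-12-02 through 2013-12-04 and a different one for 2013-12-07 and 2013-12-08, so the two runs are reported separately.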
