I have a table which contains some records ordered by date.

I want to get the start and end dates of each consecutive group of rows (grouped by some criterion, e.g. position).

Example:

    create table tbl (id int, date timestamp without time zone, position int);

    insert into tbl values
        (1, '2013-12-01', 1),
        (2, '2013-12-02', 2),
        (3, '2013-12-03', 2),
        (4, '2013-12-04', 2),
        (5, '2013-12-05', 3),
        (6, '2013-12-06', 3),
        (7, '2013-12-07', 2),
        (8, '2013-12-08', 2);

Of course, if I simply group by position I get the wrong result, because the same position can occur in several separate groups:

    SELECT POSITION, min(date) MIN, max(date) MAX
    FROM tbl
    GROUP BY POSITION

I will get:

    POSITION  MIN                              MAX
    1         December, 01 2013 00:00:00+0000  December, 01 2013 00:00:00+0000
    3         December, 05 2013 00:00:00+0000  December, 06 2013 00:00:00+0000
    2         December, 02 2013 00:00:00+0000  December, 08 2013 00:00:00+0000

But I want:

    POSITION  MIN                              MAX
    1         December, 01 2013 00:00:00+0000  December, 01 2013 00:00:00+0000
    2         December, 02 2013 00:00:00+0000  December, 04 2013 00:00:00+0000
    3         December, 05 2013 00:00:00+0000  December, 06 2013 00:00:00+0000
    2         December, 07 2013 00:00:00+0000  December, 08 2013 00:00:00+0000

I found a solution for MySQL that uses variables, and I could port it, but I believe PostgreSQL can do this in a smarter way with its advanced features such as window functions.

I'm using PostgreSQL 9.2.

2 Answers

1

There is probably a more elegant solution, but try this:

    WITH tmp_tbl AS (
        -- Flag the first row of each run: 1 when position differs from the
        -- previous row (or there is no previous row), otherwise 0.
        SELECT *,
               CASE WHEN lag(position, 1) OVER (ORDER BY id) = position
                    THEN 0
                    ELSE 1
               END AS is_new_group
        FROM tbl
    ), tmp_tbl2 AS (
        -- A running total of the flags gives every run its own number,
        -- however long the run is.
        SELECT position, date,
               sum(is_new_group) OVER (ORDER BY id) AS grouping_col
        FROM tmp_tbl
    )
    SELECT position, min(date) AS min, max(date) AS max
    FROM tmp_tbl2
    GROUP BY grouping_col, position
    ORDER BY grouping_col;

1

There are already complete answers on Stack Overflow for this, so I won't repeat them in detail, but the principle is to group the records by the difference between:

  • The row number when ordered by the date (via a window function)
  • The difference between the dates and a static date of reference.

So you have a series such as:

    rownum  datediff  diff
    1       1         0     ^
    2       2         0     |  first group
    3       3         0     v
    4       5         1     ^
    5       6         1     |  second group
    6       7         1     v
    7       9         2     ^
    8       10        2     v  third group
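The series above finds runs of consecutive dates. As a rough sketch of the same "difference of two sequences" idea applied to the question's data, one option is to replace the date difference with the row number inside each position; this substitution is an assumption on my part, not something spelled out above. The names `numbered` and `grp` are made up for illustration, and `tbl` is the table from the question.

```sql
-- Sketch only: assumes the tbl table and sample rows from the question.
SELECT position,
       min(date) AS min,
       max(date) AS max
FROM (
    SELECT position,
           date,
           -- Overall row number minus the row number within the same position.
           -- The difference stays constant while consecutive rows share a
           -- position and changes as soon as another position interrupts the run.
           row_number() OVER (ORDER BY date)
             - row_number() OVER (PARTITION BY position ORDER BY date) AS grp
    FROM tbl
) AS numbered
GROUP BY position, grp
ORDER BY min(date);
```

Grouping by both `position` and `grp` matters, because two different positions can happen to produce the same difference. Both window functions used here are available in PostgreSQL 9.2.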
