I have the following "starting" query:

select fecha as date,
       velocidad as speed,
       velocidad > 100 as overspeed
from reports.avl_historico_354898046636089
where fecha between '2017-04-19 00:00:00-03' and '2017-04-20 00:00:00-03'
  and velocidad > 2
  and ignicion = 1
order by fecha;

Which yields the following output:

date,speed,overspeed
2017-04-19 11:35:41+00,16,f
2017-04-19 11:37:01+00,24,f
2017-04-19 11:37:41+00,72,f
2017-04-19 11:38:21+00,82,f
2017-04-19 11:39:01+00,13,f
2017-04-19 11:39:41+00,68,f
2017-04-19 11:40:21+00,23,f
2017-04-19 11:41:01+00,57,f
2017-04-19 11:41:41+00,97,f
2017-04-19 11:42:21+00,96,f
2017-04-19 11:43:01+00,102,t
2017-04-19 11:43:41+00,104,t
2017-04-19 11:44:21+00,106,t
2017-04-19 11:45:01+00,109,t
2017-04-19 11:45:41+00,109,t
2017-04-19 11:46:21+00,114,t
2017-04-19 11:47:01+00,56,f
2017-04-19 11:47:28+00,54,f
2017-04-19 11:47:41+00,54,f
2017-04-19 11:48:21+00,54,f
2017-04-19 11:49:01+00,102,t
2017-04-19 11:49:07+00,104,t
2017-04-19 11:54:21+00,114,t
2017-04-19 11:55:01+00,118,t
2017-04-19 11:55:41+00,115,t
2017-04-19 11:56:21+00,111,t
2017-04-19 11:57:01+00,85,f
2017-04-19 11:57:41+00,45,f
2017-04-19 11:58:21+00,29,f
2017-04-19 12:00:35+00,4,f
2017-04-19 12:00:36+00,4,f
...

And I've been trying to work with LAG/LEAD to get the first/last date for each group of rows where the overspeed column is TRUE, but I haven't been able to achieve the desired results, which could be like this:

start                   stop
2017-04-19 11:43:01+00  2017-04-19 11:46:21+00
2017-04-19 11:49:01+00  2017-04-19 11:56:21+00

Any ideas on how to get such output would be appreciated.

Original table DDL:

CREATE TABLE avl_historico_354898046636089 (
    fecha timestamp with time zone NOT NULL,
    latitud double precision DEFAULT 0 NOT NULL,
    longitud double precision DEFAULT 0 NOT NULL,
    altitud double precision DEFAULT 0 NOT NULL,
    velocidad double precision DEFAULT 0 NOT NULL,
    cog double precision DEFAULT 0 NOT NULL,
    nsat integer DEFAULT 0 NOT NULL,
    tipo character(1),
    utc_hora time without time zone,
    fix_fecha date,
    imei bigint NOT NULL,
    registro timestamp with time zone,
    input1 integer DEFAULT 0,
    input2 integer DEFAULT 0,
    input3 integer DEFAULT 0,
    input4 integer DEFAULT 0,
    hdop double precision,
    adc double precision DEFAULT (-99),
    ignicion integer DEFAULT 1,
    adc2 double precision,
    power integer,
    driverid integer,
    ibutton2 integer,
    ibutton3 integer,
    ibutton4 integer,
    trailerid integer,
    adc3 double precision,
    adc4 double precision,
    horometro bigint,
    odometro bigint,
    panico integer DEFAULT 0,
    bateria double precision,
    bateriaint double precision
);
  • Please update the output with the result of: copy (select fecha as date,velocidad as speed, velocidad>100 as overspeed from reports.avl_historico_354898046636089 where fecha between '2017-04-19 00:00:00-03' and '2017-04-20 00:00:00-03' and velocidad>2 and ignicion=1 order by fecha) to stdin delimiter ','; Commented Apr 20, 2017 at 20:11
  • In short, try the window functions min, max(date) over (partition by overspeed) and then lag/lead to combine them. Commented Apr 20, 2017 at 20:15
  • @VaoTsun just changed the output format and included the DDL Commented Apr 20, 2017 at 20:20
  • The DDL isn't enough, we need test data. Anyway, I've shown you how it should work in my answer. Commented Apr 20, 2017 at 20:24

3 Answers

This is a sample that combines grouping with window functions.

Note: I've edited some of the results just to keep them smaller.

create table test (fecha timestamp, velocidad int, overspeed bool);

insert into test values
('2017-04-19 20:18:17+00', 77, FALSE),
('2017-04-19 20:18:57+00', 96, FALSE),
('2017-04-19 20:19:37+00', 108, TRUE),
('2017-04-19 20:20:17+00', 111, TRUE),
('2017-04-19 20:20:57+00', 114, TRUE),
('2017-04-19 20:21:37+00', 112, TRUE),
('2017-04-19 20:22:17+00', 108, FALSE),
('2017-04-19 20:22:57+00', 107, FALSE),
('2017-04-19 20:23:37+00', 113, FALSE),
('2017-04-19 20:24:17+00', 116, TRUE),
('2017-04-19 20:24:57+00', 111, TRUE),
('2017-04-19 20:25:37+00', 113, TRUE),
('2017-04-19 20:26:17+00', 115, FALSE),
('2017-04-19 20:26:28+00', 115, FALSE),
('2017-04-19 20:26:57+00', 115, TRUE),
('2017-04-19 20:27:37+00', 115, TRUE),
('2017-04-19 20:27:58+00', 60, FALSE);

with ResetPoint as
(
    select fecha, velocidad, overspeed,
           case when lag(overspeed) over (order by fecha) = overspeed then null else 1 end as reset
    from test
)
--= Set a group each time overspeed changes
, SetGroup as
(
    select fecha, velocidad, overspeed,
           count(reset) over (order by fecha) as grp
    from ResetPoint
)
select * from SetGroup;

fecha               | velocidad | overspeed | grp
:------------------ | --------: | :-------- | --:
2017-04-19 20:18:17 |        77 | f         |   1
2017-04-19 20:18:57 |        96 | f         |   1
2017-04-19 20:19:37 |       108 | t         |   2
2017-04-19 20:20:17 |       111 | t         |   2
2017-04-19 20:20:57 |       114 | t         |   2
2017-04-19 20:21:37 |       112 | t         |   2
2017-04-19 20:22:17 |       108 | f         |   3
2017-04-19 20:22:57 |       107 | f         |   3
2017-04-19 20:23:37 |       113 | f         |   3
2017-04-19 20:24:17 |       116 | t         |   4
2017-04-19 20:24:57 |       111 | t         |   4
2017-04-19 20:25:37 |       113 | t         |   4
2017-04-19 20:26:17 |       115 | f         |   5
2017-04-19 20:26:28 |       115 | f         |   5
2017-04-19 20:26:57 |       115 | t         |   6
2017-04-19 20:27:37 |       115 | t         |   6
2017-04-19 20:27:58 |        60 | f         |   7

--= Set a reset point each time overspeed changes
with ResetPoint as
(
    select fecha, velocidad, overspeed,
           case when lag(overspeed) over (order by fecha) = overspeed then null else 1 end as reset
    from test
)
--= Set a group each time overspeed changes
, SetGroup as
(
    select fecha, velocidad, overspeed,
           count(reset) over (order by fecha) as grp
    from ResetPoint
)
--= Returns MIN and MAX date of each group
select grp, min(fecha) as Start, max(fecha) as End
from SetGroup
group by grp;

grp | start               | end
--: | :------------------ | :------------------
  4 | 2017-04-19 20:24:17 | 2017-04-19 20:25:37
  1 | 2017-04-19 20:18:17 | 2017-04-19 20:18:57
  5 | 2017-04-19 20:26:17 | 2017-04-19 20:26:28
  3 | 2017-04-19 20:22:17 | 2017-04-19 20:23:37
  6 | 2017-04-19 20:26:57 | 2017-04-19 20:27:37
  2 | 2017-04-19 20:19:37 | 2017-04-19 20:21:37
  7 | 2017-04-19 20:27:58 | 2017-04-19 20:27:58

dbfiddle here

1 Comment

same comment as in the answer from Erwin Brandstetter, but I'll try to adjust it to the actual table to compare execution plans and timing

SELECT grp, min(date) AS start, max(date) AS stop
FROM (
    SELECT date, speed, count(is_reset) OVER () AS grp
    FROM (
        SELECT date, speed,
               CASE WHEN overspeed <> lag(overspeed) OVER (ORDER BY date) THEN 1 END AS is_reset
        FROM (
            select fecha as date, velocidad as speed, velocidad>100 as overspeed
            from reports.avl_historico_354898046636089
            where fecha between '2017-04-19 00:00:00-03' and '2017-04-20 00:00:00-03'
              and velocidad>2
              and ignicion=1
        ) AS t
    ) AS t2
) AS t3
GROUP BY grp;

5 Comments

There's something wrong: this query creates only a single row of output, and there should be at least 2 with the data provided in the question (output: 8, 2017-04-19 11:07:06+00, 2017-04-19 20:39:54+00)
I don't believe you, prove it.
Still don't believe you.
What else should I provide?
Dump the data from reports.avl_historico_354898046636089 into the question. Show me the output you want. You can use COPY to do it quickly
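
A likely explanation for the single row reported in the comments above: count(is_reset) OVER () has no ORDER BY, so the window spans every row and each row receives the same total (the 8 in the reported output) instead of a running count. Adding ORDER BY date to that window would presumably give the running count the answer appears to intend. The sketch below compares the two forms on a handful of rows. It uses Python's built-in sqlite3 module purely as a stand-in for PostgreSQL so that it is self-contained (an assumption; SQLite 3.25 or later is needed for window functions), and the table t, the column ts, and the 0/1 encoding of overspeed are illustrative only.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical miniature table; overspeed is stored as 0/1 because
# SQLite (a stand-in for PostgreSQL here) has no boolean type.
con.execute("CREATE TABLE t (ts TEXT, overspeed INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)", [
    ("11:42:21", 0), ("11:43:01", 1), ("11:43:41", 1),
    ("11:47:01", 0), ("11:49:01", 1),
])

result = con.execute("""
    SELECT ts,
           count(is_reset) OVER ()            AS total_count,   -- same value on every row
           count(is_reset) OVER (ORDER BY ts) AS running_count  -- grows at each change
    FROM (
        SELECT ts,
               CASE WHEN overspeed <> lag(overspeed) OVER (ORDER BY ts)
                    THEN 1 END AS is_reset
        FROM   t
    ) AS marked
    ORDER BY ts
""").fetchall()

for row in result:
    print(row)
```

Grouping by total_count collapses everything into one group, while running_count changes whenever overspeed flips and so yields one group per consecutive run.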

This can be simpler. Subtract two row_number() calls:

SELECT min(date) AS start
     , max(date) AS stop
FROM  (
   SELECT date, overspeed
        , row_number() OVER (ORDER BY date)
        - row_number() OVER (PARTITION BY overspeed ORDER BY date) AS grp
   FROM   tbl  -- result of your starting query
   ) sub
WHERE  overspeed
GROUP  BY grp
ORDER  BY grp;

The first row_number() generates a running number over all rows, the second a running number within each overspeed partition. When you subtract the second from the first, all rows of a consecutive run end up with the same group number, distinct per partition.
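
To see why the subtraction works, here is a small sketch that computes both row numbers for a few rows taken from the question's output. It runs the SQL through Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only to keep the example self-contained; SQLite 3.25 or later is required for window functions). The table name tbl and the 0/1 encoding of overspeed are illustrative.

```python
import sqlite3

# A few rows from the question's output (time zone suffix dropped);
# overspeed is 0/1 because SQLite has no boolean type.
rows = [
    ("2017-04-19 11:41:41", 97, 0),
    ("2017-04-19 11:42:21", 96, 0),
    ("2017-04-19 11:43:01", 102, 1),
    ("2017-04-19 11:43:41", 104, 1),
    ("2017-04-19 11:47:01", 56, 0),
    ("2017-04-19 11:49:01", 102, 1),
    ("2017-04-19 11:49:07", 104, 1),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (date TEXT, speed INTEGER, overspeed INTEGER)")
con.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

numbered = con.execute("""
    SELECT date, overspeed,
           row_number() OVER (ORDER BY date)                        AS rn_all,
           row_number() OVER (PARTITION BY overspeed ORDER BY date) AS rn_part
    FROM   tbl
    ORDER  BY date
""").fetchall()

# grp = rn_all - rn_part is constant within each consecutive run of equal overspeed
groups = [(date, overspeed, rn_all - rn_part)
          for date, overspeed, rn_all, rn_part in numbered]
for g in groups:
    print(g)
```

In this sample the differences come out as 0, 0, 2, 2, 2, 3, 3. The non-overspeed row at 11:47:01 shares the number 2 with the preceding overspeed run, which is what "distinct per partition" means: the number identifies a run only together with the overspeed value, so the query has to filter on overspeed (or also group by it).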

Then keep only the overspeed rows in the outer query and take min and max per group. Voilà.

Detailed explanation:

Aside: a timestamp is not a date. That's a confusing column name.

Integrate your subquery

To address your comment: replace tbl with your original query as a subquery, like this:

SELECT min(date) AS start
     , max(date) AS stop
FROM  (
   SELECT date, overspeed
        , row_number() OVER (ORDER BY date)
        - row_number() OVER (PARTITION BY overspeed ORDER BY date) AS grp
   FROM  (
      SELECT fecha AS date, velocidad AS speed, velocidad > 100 AS overspeed
      FROM   reports.avl_historico_354898046636089
      WHERE  fecha >= '2017-04-19 00:00:00-03'  -- typically, you include the lower
      AND    fecha <  '2017-04-20 00:00:00-03'  -- and exclude the upper bound
      AND    velocidad > 2
      AND    ignicion = 1
      -- drop the now useless inner ORDER BY
      ) sub1
   ) sub2
WHERE  overspeed
GROUP  BY grp
ORDER  BY grp;

Then you can simplify some more:

SELECT min(fecha) AS start
     , max(fecha) AS stop
FROM  (
   SELECT fecha, velocidad > 100 AS overspeed
        , row_number() OVER (ORDER BY fecha)
        - row_number() OVER (PARTITION BY velocidad > 100 ORDER BY fecha) AS grp
   FROM   reports.avl_historico_354898046636089
   WHERE  fecha >= '2017-04-19 00:00:00-03'
   AND    fecha <  '2017-04-20 00:00:00-03'
   AND    velocidad > 2
   AND    ignicion = 1
   ) sub
WHERE  overspeed
GROUP  BY grp
ORDER  BY grp;

3 Comments

Sounds like quite a simple answer, but it treats the output of my original query as if it were an actual table, which is not the real-life scenario.
@gvasquez; I added advice above.
It now works perfectly as expected. Narrowing down the actual use case, I had to add avg(velocidad) to the selected columns and HAVING max(fecha)-min(fecha) > '2 minutes'::interval as an extra clause to completely fulfill the request. Thanks a lot, Erwin.
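
The extension described in the last comment could look roughly like the sketch below. It is not the commenter's actual query. It runs through Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption; SQLite 3.25 or later is needed), so the duration test is written with strftime('%s', ...) where PostgreSQL would use HAVING max(fecha) - min(fecha) > '2 minutes'::interval as quoted in the comment. The table name avl is hypothetical, and the rows are a trimmed subset of the question's output in which the second overspeed run is cut short on purpose.

```python
import sqlite3

# Trimmed subset of the question's rows (time zone suffix dropped). The second
# overspeed run is deliberately shortened so the duration filter rejects it.
rows = [
    ("2017-04-19 11:43:01", 102),
    ("2017-04-19 11:43:41", 104),
    ("2017-04-19 11:44:21", 106),
    ("2017-04-19 11:45:01", 109),
    ("2017-04-19 11:45:41", 109),
    ("2017-04-19 11:46:21", 114),
    ("2017-04-19 11:47:01", 56),
    ("2017-04-19 11:49:01", 102),
    ("2017-04-19 11:49:07", 104),
    ("2017-04-19 11:57:01", 85),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE avl (fecha TEXT, velocidad INTEGER)")  # hypothetical table
con.executemany("INSERT INTO avl VALUES (?, ?)", rows)

islands = con.execute("""
    SELECT min(fecha) AS start, max(fecha) AS stop, avg(velocidad) AS avg_speed
    FROM (
        SELECT fecha, velocidad, velocidad > 100 AS overspeed,
               row_number() OVER (ORDER BY fecha)
             - row_number() OVER (PARTITION BY velocidad > 100 ORDER BY fecha) AS grp
        FROM   avl
    ) sub
    WHERE  overspeed
    GROUP  BY grp
    -- SQLite stand-in for: max(fecha) - min(fecha) > '2 minutes'::interval
    HAVING strftime('%s', max(fecha)) - strftime('%s', min(fecha)) > 120
    ORDER  BY grp
""").fetchall()

print(islands)
```

Only the run from 11:43:01 to 11:46:21 lasts longer than two minutes, so it is the only row returned; the six-second run starting at 11:49:01 is dropped by the HAVING clause.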
