1

I have a table below with sample data.

Date&Time Message
13/07/2022 8:59:09 Normal
13/07/2022 8:59:10 Normal
13/07/2022 8:59:11 Normal
13/07/2022 8:59:12 Warning
13/07/2022 8:59:13 Warning
13/07/2022 8:59:14 Warning
13/07/2022 8:59:15 Warning
13/07/2022 8:59:16 Error
13/07/2022 8:59:17 Error
13/07/2022 8:59:18 Warning
13/07/2022 8:59:19 Warning
13/07/2022 8:59:20 Warning
13/07/2022 8:59:21 Normal
13/07/2022 8:59:22 Normal
13/07/2022 8:59:23 Normal
13/07/2022 8:59:24 Warning
13/07/2022 8:59:25 Warning
13/07/2022 8:59:26 Warning
13/07/2022 8:59:27 Error
13/07/2022 8:59:28 Error

I need to write a Postgres query which will give me output like:

Start Date&Time End Date&Time
13/07/2022 8:59:12 13/07/2022 8:59:15
13/07/2022 8:59:24 13/07/2022 8:59:26

The scenario is - if there is any 'Error' message, I need to take the start time of 'Warning' and end time of Warning, and if there is no 'Error' message after 'Warning' ignore it e.g. after 13/07/2022 8:59:20 Warning' there is no error, so query should ignore that range. Please help to find a query for this.

Setup table queries:

CREATE TABLE test_data ( id integer PRIMARY KEY , message VARCHAR(10) , datetimestamp timestamp without time zone NOT NULL ); SET datestyle = 'DMY'; INSERT INTO test_data VALUES (09, 'Normal' , '13/07/2022 8:59:09') , (10, 'Normal' , '13/07/2022 8:59:10') , (11, 'Normal' , '13/07/2022 8:59:11') , (12, 'Warning', '13/07/2022 8:59:12') , (13, 'Warning', '13/07/2022 8:59:13') , (14, 'Warning', '13/07/2022 8:59:14') , (15, 'Warning', '13/07/2022 8:59:15') , (16, 'Error' , '13/07/2022 8:59:16') , (17, 'Error' , '13/07/2022 8:59:17') , (18, 'Warning', '13/07/2022 8:59:18') , (19, 'Warning', '13/07/2022 8:59:19') , (20, 'Warning', '13/07/2022 8:59:20') , (21, 'Normal' , '13/07/2022 8:59:21') , (22, 'Normal' , '13/07/2022 8:59:22') , (23, 'Normal' , '13/07/2022 8:59:23') , (24, 'Warning', '13/07/2022 8:59:24') , (25, 'Warning', '13/07/2022 8:59:25') , (26, 'Warning', '13/07/2022 8:59:26') , (27, 'Error' , '13/07/2022 8:59:27') , (28, 'Error' , '13/07/2022 8:59:28') ; 
1

2 Answers 2

1

This is a typical problem.

SELECT min(datetimestamp) FILTER (WHERE message = 'Warning') AS start_warning , max(datetimestamp) FILTER (WHERE message = 'Warning') AS end_warning FROM ( SELECT * , count(*) FILTER (WHERE message = 'Warning' AND last_msg <> 'Warning') OVER (ORDER BY datetimestamp) AS grp FROM ( SELECT datetimestamp, message , lag(message) OVER (ORDER BY datetimestamp) AS last_msg FROM test_data ) sub1 WHERE message IN ('Warning', 'Error') ) sub2 GROUP BY grp HAVING bool_or(message = 'Error' AND last_msg = 'Warning'); 

db<>fiddle here

The query starts a new group whenever a 'Warning' is preceded by something else. And it only reports times when the 'Warning' block is followed by an 'Error'. You can have any number of different message types (except for NULL), these are the only conditions.

The expensive part is the window function. (In particular when using different window frames.) This query makes do with a single window function and a single sort order. So it should be as fast (and simple) as it gets.

Related:

About the aggregate FILTER:

Sign up to request clarification or add additional context in comments.

1 Comment

thank you for reply. Along with the start and end datetime I need to fetch two more column details(Posted separately - stackoverflow.com/questions/73176222/…). Is there any way to achieve this ?
1

Please refer to the below script.

WITH cte AS ( SELECT id, message, datetimestamp, Extract( second FROM datetimestamp )::bigint - rank() over(partition BY message ORDER BY datetimestamp) AS gap FROM ( SELECT id, message, datetimestamp FROM ( SELECT id, message, datetimestamp , lag(message) OVER (ORDER BY datetimestamp) , lead(message) OVER (ORDER BY datetimestamp) , lag(message,1) OVER (ORDER BY datetimestamp) IN ('Normal', 'Warning') AS step1 , lead(message,1) OVER (ORDER BY datetimestamp) IN ('Warning', 'Error') AS step2 , lag(message,2) OVER (ORDER BY datetimestamp) IN ('Normal', 'Warning') AS step3 , lead(message,2) OVER (ORDER BY datetimestamp) IN ('Warning', 'Error') AS step4 FROM test_data ORDER BY datetimestamp) sub1 WHERE step1 AND step2 AND step3 AND step4 AND message = 'Warning' ) sub ) SELECT gap, min(datetimestamp), max(datetimestamp) FROM cte GROUP BY gap 

There is one sample

1 Comment

thank you for reply. Along with the start and end datetime I need to fetch two more column details(Posted separately - stackoverflow.com/questions/73176222/…). Is there any way to achieve this ?

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.