I need to calculate the total duration of events in a given time interval and on a given day of the week with PostgreSQL 11
For an event on the same day no problem. But over several days I manage to approximate the result with "generate_series".u Is there another way faster and more precise?
Interval In:
Start---------------------------------------End E.begin_date ------ E.end_date -> E.end_date - E.begin_date Start--------------------------------End E.begin_date -------------------E.end_date -> End - E.begin_date Start--------------------------------End E.begin_date -----------------------E.end_date -> E.end_date - Start Start------------------End E.begin_date-------------------------------------E.end_date -> End - Start Interval In on several Days with generate_series
Day 1 Day 2 etc.... | | Start-------End | Start---------End | Start---------End E.begin_date--------------------------------------------------------------E.end_date Interval Out:
Start--------End E.begin_date ------------------- E.end_date -> (Start - E.begin_date) + (E.end_date - End) Start--------------------------------End E.begin_date -------------------E.end_date -> E.end_date - End Start--------------------------------End E.begin_date -----------------------E.end_date -> Start - .begin_date Start--------End E.begin_date--------E.end_date -> E.end_date - E.begin_date Interval Out on several Days with generate_series
Day 1 Day 2 etc.... | | Start-------End | Start---------End | Start---------End E.begin_date--------------------------------------------------------------E.end_date This is my sql request:
select * FROM (SELECT sum( case when (begin_date AT TIME ZONE 'Europe/Paris')::date = (end_date AT TIME ZONE 'Europe/Paris')::date -- same days then case -- ------time_begin------------------------------------------------------------------------time_end---- -- begin_date------------------------------------------------end_date when (begin_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time then (end_date AT TIME ZONE 'Europe/Paris')::time - (begin_date AT TIME ZONE 'Europe/Paris')::time -- ------time_begin------------------------------------------------------------------------time_end---- -- begin_date------------------------------------------------end_date when (begin_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (begin_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time >= '18:00'::time then '18:00'::time - (begin_date AT TIME ZONE 'Europe/Paris') ::time -- -----------------time_begin------------------------------------------------------------------------time_end---- -- begin_date-----------------------------------end_date when (end_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time AND (begin_date AT TIME ZONE 'Europe/Paris')::time <= '08:00'::time then (end_date AT TIME ZONE 'Europe/Paris')::time - '08:00'::time -- -----------------time_begin-------------------------------------time_end---- -- begin_date----------------------------------------------------------------end_date when (begin_date AT TIME ZONE 'Europe/Paris')::time <= '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time >= '18:00'::time then '18:00'::time - '08:00'::time END when (begin_date AT TIME ZONE 'Europe/Paris')::date < (end_date AT TIME ZONE 'Europe/Paris')::date -- on many days then end_date - begin_date (select COUNT(*) * interval '5 min' FROM generate_series(begin_date AT TIME ZONE 'Europe/Paris', end_date AT TIME ZONE 'Europe/Paris', interval '5 min') m where (m::time AT TIME ZONE 'Europe/Paris') >= '08:00'::time AND (m::time AT TIME ZONE 'Europe/Paris') <= '18:00'::time AND EXTRACT(dow FROM m) + 1 in (2, 3) ) END) as total_included FROM "event" WHERE ( (EXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) OR EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) ) AND ("event"."begin_date" AT TIME ZONE 'Europe/Paris')::date >= '2019-12-01T00:00:00'::date AND ("event"."begin_date" AT TIME ZONE 'Europe/Paris')::date <= '2020-01-01T00:00:00'::date AND ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time between '08:00'::time AND '18:00'::time) OR ((event.end_date AT TIME ZONE 'Europe/Paris')::time between '08:00'::time AND '18:00'::time) OR ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time) AND ((event.end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time)) )) order by total_included desc) as included, (SELECT sum( case when (begin_date AT TIME ZONE 'Europe/Paris')::date = (end_date AT TIME ZONE 'Europe/Paris')::date -- same day then case -- -----------------------------------------------------------------time_begin---------------------------------time_end---- -- begin_date-------------------------------end_date -- or -- -------------------------------time_begin---------------------------------time_end----------- -- begin_date----------------------------------------------------------------------end_date when ((begin_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) then case when ((begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time) then end_date::time - begin_date::time when (begin_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time then end_date::time - begin_date::time when (begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time then ('08:00'::time - (begin_date AT TIME ZONE 'Europe/Paris')::time) + ((end_date AT TIME ZONE 'Europe/Paris')::time - '18:00'::time) end -- --------------------------------------time_begin---------------------------------time_end---- -- begin_date-----------------------------------end_date when ((begin_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time BETWEEN '08:00'::time AND '18:00'::time) then '08:00'::time - (begin_date AT TIME ZONE 'Europe/Paris')::time -- ----------time_begin---------------------------------time_end---- -- begin_date-----------------------------------end_date when ((begin_date AT TIME ZONE 'Europe/Paris')::time BETWEEN '08:00'::time AND '18:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) then (end_date AT TIME ZONE 'Europe/Paris')::time - '18:00'::time end when (begin_date AT TIME ZONE 'Europe/Paris')::date < (end_date AT TIME ZONE 'Europe/Paris')::date --many days then (select COUNT(*) * interval '5 min' FROM generate_series(begin_date , end_date, interval '5 min') m where (m AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (m AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time AND EXTRACT(dow FROM m AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) ) END) as total_excluded FROM "event" WHERE ((EXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) OR EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) ) AND ("event"."date_creation" AT TIME ZONE 'Europe/Paris')::date >= '2019-12-01T00:00:00'::date AND ("event"."date_creation" AT TIME ZONE 'Europe/Paris')::date <= '2020-01-01T00:00:00'::date AND ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time not between '08:00'::time AND '18:00'::time) OR ((event.end_date AT TIME ZONE 'Europe/Paris')::time not between '08:00'::time AND '18:00'::time) OR ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time > '08:00'::time) AND ((event.end_date AT TIME ZONE 'Europe/Paris')::time < '18:00'::time)) )) order by total_excluded desc) as excluded Data Sample for interval IN ('8am', '6pm') for Wednesday Thursday:
|---------|------------------|------------------|------------------| | id | begin_date | end_date | Result(Duration) | |---------|------------------|------------------|------------------| | 1 | 2020-01-01 10:00 | 2020-01-01 12:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 2 | 2020-01-01 10:00 | 2020-01-01 20:00 | 08:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 3 | 2020-01-01 07:00 | 2020-01-01 14:00 | 06:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 4 | 2020-01-01 07:00 | 2020-01-01 19:00 | 10:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 5 | 2020-01-01 08:00 | 2020-01-03 18:00 | 20:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 6 | 2020-01-01 09:00 | 2020-01-05 17:00 | 16:00:00 | |---------|------------------|------------------|------------------| Data Sample for interval OUT ('8am', '6pm') for Wednesday Thursday:
|---------|------------------|------------------|------------------| | id | begin_date | end_date | Result(Duration) | |---------|------------------|------------------|------------------| | 1 | 2020-01-01 10:00 | 2020-01-01 12:00 | 00:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 2 | 2020-01-01 10:00 | 2020-01-01 20:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 3 | 2020-01-01 07:00 | 2020-01-01 14:00 | 01:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 4 | 2020-01-01 07:00 | 2020-01-01 19:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 5 | 2020-01-01 07:00 | 2020-01-03 18:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 6 | 2020-01-01 09:00 | 2020-01-05 17:00 | 02:00:00 | |---------|------------------|------------------|------------------| The input parameters are:
- start date and end date
- start time and end time
- a list of the days of the week
To reach a result where we have the total duration of events within the interval, but also the total duration of events outside the interval of hours. All this for a list of days of the given week.
Days of the week means that if an event lasts several days say from Monday to Wednesday, but in the filter there are only Monday and Tuesday, the duration of Wednesday will not be counted.
The current query works but to handle events that take place over several days I use generate_series, which is not very efficient. The question is how to improve this query.