0

I need to calculate the total duration of events in a given time interval and on a given day of the week with PostgreSQL 11

For an event on the same day no problem. But over several days I manage to approximate the result with "generate_series".u Is there another way faster and more precise?

Interval In:

Start---------------------------------------End E.begin_date ------ E.end_date -> E.end_date - E.begin_date Start--------------------------------End E.begin_date -------------------E.end_date -> End - E.begin_date Start--------------------------------End E.begin_date -----------------------E.end_date -> E.end_date - Start Start------------------End E.begin_date-------------------------------------E.end_date -> End - Start 

Interval In on several Days with generate_series

 Day 1 Day 2 etc.... | | Start-------End | Start---------End | Start---------End E.begin_date--------------------------------------------------------------E.end_date 

Interval Out:

 Start--------End E.begin_date ------------------- E.end_date -> (Start - E.begin_date) + (E.end_date - End) Start--------------------------------End E.begin_date -------------------E.end_date -> E.end_date - End Start--------------------------------End E.begin_date -----------------------E.end_date -> Start - .begin_date Start--------End E.begin_date--------E.end_date -> E.end_date - E.begin_date 

Interval Out on several Days with generate_series

 Day 1 Day 2 etc.... | | Start-------End | Start---------End | Start---------End E.begin_date--------------------------------------------------------------E.end_date 

This is my sql request:

select * FROM (SELECT sum( case when (begin_date AT TIME ZONE 'Europe/Paris')::date = (end_date AT TIME ZONE 'Europe/Paris')::date -- same days then case -- ------time_begin------------------------------------------------------------------------time_end---- -- begin_date------------------------------------------------end_date when (begin_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time then (end_date AT TIME ZONE 'Europe/Paris')::time - (begin_date AT TIME ZONE 'Europe/Paris')::time -- ------time_begin------------------------------------------------------------------------time_end---- -- begin_date------------------------------------------------end_date when (begin_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (begin_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time >= '18:00'::time then '18:00'::time - (begin_date AT TIME ZONE 'Europe/Paris') ::time -- -----------------time_begin------------------------------------------------------------------------time_end---- -- begin_date-----------------------------------end_date when (end_date AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time AND (begin_date AT TIME ZONE 'Europe/Paris')::time <= '08:00'::time then (end_date AT TIME ZONE 'Europe/Paris')::time - '08:00'::time -- -----------------time_begin-------------------------------------time_end---- -- begin_date----------------------------------------------------------------end_date when (begin_date AT TIME ZONE 'Europe/Paris')::time <= '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time >= '18:00'::time then '18:00'::time - '08:00'::time END when (begin_date AT TIME ZONE 'Europe/Paris')::date < (end_date AT TIME ZONE 'Europe/Paris')::date -- on many days then end_date - begin_date (select COUNT(*) * interval '5 min' FROM generate_series(begin_date AT TIME ZONE 'Europe/Paris', end_date AT TIME ZONE 'Europe/Paris', interval '5 min') m where (m::time AT TIME ZONE 'Europe/Paris') >= '08:00'::time AND (m::time AT TIME ZONE 'Europe/Paris') <= '18:00'::time AND EXTRACT(dow FROM m) + 1 in (2, 3) ) END) as total_included FROM "event" WHERE ( (EXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) OR EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) ) AND ("event"."begin_date" AT TIME ZONE 'Europe/Paris')::date >= '2019-12-01T00:00:00'::date AND ("event"."begin_date" AT TIME ZONE 'Europe/Paris')::date <= '2020-01-01T00:00:00'::date AND ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time between '08:00'::time AND '18:00'::time) OR ((event.end_date AT TIME ZONE 'Europe/Paris')::time between '08:00'::time AND '18:00'::time) OR ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time) AND ((event.end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time)) )) order by total_included desc) as included, (SELECT sum( case when (begin_date AT TIME ZONE 'Europe/Paris')::date = (end_date AT TIME ZONE 'Europe/Paris')::date -- same day then case -- -----------------------------------------------------------------time_begin---------------------------------time_end---- -- begin_date-------------------------------end_date -- or -- -------------------------------time_begin---------------------------------time_end----------- -- begin_date----------------------------------------------------------------------end_date when ((begin_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) then case when ((begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time) then end_date::time - begin_date::time when (begin_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time then end_date::time - begin_date::time when (begin_date AT TIME ZONE 'Europe/Paris')::time < '08:00'::time AND (end_date AT TIME ZONE 'Europe/Paris')::time > '18:00'::time then ('08:00'::time - (begin_date AT TIME ZONE 'Europe/Paris')::time) + ((end_date AT TIME ZONE 'Europe/Paris')::time - '18:00'::time) end -- --------------------------------------time_begin---------------------------------time_end---- -- begin_date-----------------------------------end_date when ((begin_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time BETWEEN '08:00'::time AND '18:00'::time) then '08:00'::time - (begin_date AT TIME ZONE 'Europe/Paris')::time -- ----------time_begin---------------------------------time_end---- -- begin_date-----------------------------------end_date when ((begin_date AT TIME ZONE 'Europe/Paris')::time BETWEEN '08:00'::time AND '18:00'::time) AND ((end_date AT TIME ZONE 'Europe/Paris')::time NOT BETWEEN '08:00'::time AND '18:00'::time) then (end_date AT TIME ZONE 'Europe/Paris')::time - '18:00'::time end when (begin_date AT TIME ZONE 'Europe/Paris')::date < (end_date AT TIME ZONE 'Europe/Paris')::date --many days then (select COUNT(*) * interval '5 min' FROM generate_series(begin_date , end_date, interval '5 min') m where (m AT TIME ZONE 'Europe/Paris')::time >= '08:00'::time AND (m AT TIME ZONE 'Europe/Paris')::time <= '18:00'::time AND EXTRACT(dow FROM m AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) ) END) as total_excluded FROM "event" WHERE ((EXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) OR EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2, 3) ) AND ("event"."date_creation" AT TIME ZONE 'Europe/Paris')::date >= '2019-12-01T00:00:00'::date AND ("event"."date_creation" AT TIME ZONE 'Europe/Paris')::date <= '2020-01-01T00:00:00'::date AND ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time not between '08:00'::time AND '18:00'::time) OR ((event.end_date AT TIME ZONE 'Europe/Paris')::time not between '08:00'::time AND '18:00'::time) OR ( ((event.begin_date AT TIME ZONE 'Europe/Paris')::time > '08:00'::time) AND ((event.end_date AT TIME ZONE 'Europe/Paris')::time < '18:00'::time)) )) order by total_excluded desc) as excluded 

Data Sample for interval IN ('8am', '6pm') for Wednesday Thursday:

|---------|------------------|------------------|------------------| | id | begin_date | end_date | Result(Duration) | |---------|------------------|------------------|------------------| | 1 | 2020-01-01 10:00 | 2020-01-01 12:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 2 | 2020-01-01 10:00 | 2020-01-01 20:00 | 08:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 3 | 2020-01-01 07:00 | 2020-01-01 14:00 | 06:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 4 | 2020-01-01 07:00 | 2020-01-01 19:00 | 10:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 5 | 2020-01-01 08:00 | 2020-01-03 18:00 | 20:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 6 | 2020-01-01 09:00 | 2020-01-05 17:00 | 16:00:00 | |---------|------------------|------------------|------------------| 

Data Sample for interval OUT ('8am', '6pm') for Wednesday Thursday:

|---------|------------------|------------------|------------------| | id | begin_date | end_date | Result(Duration) | |---------|------------------|------------------|------------------| | 1 | 2020-01-01 10:00 | 2020-01-01 12:00 | 00:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 2 | 2020-01-01 10:00 | 2020-01-01 20:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 3 | 2020-01-01 07:00 | 2020-01-01 14:00 | 01:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 4 | 2020-01-01 07:00 | 2020-01-01 19:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 5 | 2020-01-01 07:00 | 2020-01-03 18:00 | 02:00:00 | |---------|------------------|------------------|------------------| |---------|------------------|------------------|------------------| | 6 | 2020-01-01 09:00 | 2020-01-05 17:00 | 02:00:00 | |---------|------------------|------------------|------------------| 

The input parameters are:

  • start date and end date
  • start time and end time
  • a list of the days of the week

To reach a result where we have the total duration of events within the interval, but also the total duration of events outside the interval of hours. All this for a list of days of the given week.

Days of the week means that if an event lasts several days say from Monday to Wednesday, but in the filter there are only Monday and Tuesday, the duration of Wednesday will not be counted.

The current query works but to handle events that take place over several days I use generate_series, which is not very efficient. The question is how to improve this query.

1
  • Can't run the query in the question, it has a syntax error in the middle. Anyway I've attempted an answer with a much simpler query, please check it out. Commented Apr 15, 2020 at 11:43

1 Answer 1

1

Interpretation of the question

Given that an event starts at begin_date and is finished at end_date, the goal is to map this duration onto days/hours that seem like hours open to business, presumably to compute how much work time each event consumed.

Based on the query provided, the working hours are 08:00-18:00. It's not clear how you determine the working days. Sometimes the condition in the query is EXTRACT(dow FROM m) + 1 in (1), and at other places it's EXTRACT(dow FROM "event"."begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (2) OR EXTRACT(dow FROM "event"."end_date" AT TIME ZONE 'Europe/Paris') + 1 in (2) )

The question says that there is a list of the days of the week in input but I don't see that in the query.

Solution with intersection of ranges

Below is a query that demonstrates how to use ranges to sum your "IN" intervals.

What it does:

  • generate the 08:00-18:00 ranges for each day between the start and end of event
  • filters the days to exclude based on the "day of week" number. If you had a calendar table instead, a join could be used.
  • computes the intersection of these ranges with the ranges of the event
  • sum the durations per event

Query

WITH working_days AS ( select id, tstzrange( begin_date::date + '8 hours'::interval+ n*interval '1 day' ,begin_date::date + '18 hours'::interval + n*interval '1 day' ) AS ref from (select id, begin_date, generate_series(0, extract(days from end_date-begin_date)::int, 1) as n from event) AS day_numbers where date_part('dow', begin_date::date + n*interval '1 day') between 1 and 5 ), ranges as ( select event.id, ref * tstzrange(begin_date,end_date) AS t_inc from working_days join event using(id) ) select id, sum(upper(t_inc)-lower(t_inc)) AS total_included from ranges group by id order by id ; 

Result

With the sample data the result is

 id | total_included ----+---------------- 1 | 02:00:00 2 | 08:00:00 3 | 06:00:00 4 | 10:00:00 5 | 30:00:00 6 | 29:00:00 

This matches the question except for id=6, but that might be because the dow filter doesn't match yours.

As for the OUT intervals, if I understand correctly I think it's just a difference, that is OUT = (end_date - begin_date) - IN for each event.

6
  • First of all thank you for your answer. For ID 6 in fact it's a mistake on my part on the dow. Every day in my initial request must be the same. The days of the week in the entry are in the clause : AND (EXTRACT(dow FROM "billing_transaction". "begin_date" AT TIME ZONE 'Europe/Paris') + 1 in (1) OR EXTRACT(dow FROM "billing_transaction". "end_date" AT TIME ZONE 'Europe/Paris') + 1 in (1) ) I've edited my question and correct data. Your awser is good Commented Apr 16, 2020 at 9:43
  • The results of total_included look good, however, I'd like to add up all the events to get a total duration. I was thinking sum(sum..) but that's not possible Commented Apr 16, 2020 at 9:44
  • @Hadrien: you can remove the group by id order by id to have a grand total. Or you can move the whole query into a subquery and sum total_included at the top level. Commented Apr 16, 2020 at 11:36
  • @Hadrien: when substracting a range from a bigger range that entirely contains it, the result is two ranges, so that can't fit into a single tstzrange value. But if excluded = (end_date-begin_date) - included, why bother with a more complicated way to compute it? Commented Apr 17, 2020 at 10:39
  • Yes I understand, but for total_excluded for a event with begin_date 2019-12-01 18:00:00 and end_date 2019-12-02 08:00:00 with filter start_time 18:00and end_time 23:00and dow Sunday the result is 1 hour because all the time of the monday not matter because of dow filter Commented Apr 17, 2020 at 12:57

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.