Redshift SQL: Get date difference based on start and end dates

Question

My table has start_date and end_date from which I need to find the hour difference. The issue is that both of these date times are not on the same day.

user start_date end_date difference Alex 7/25/2016 16:00 7/26/2016 0:30 8.5 Alex 7/24/2016 16:00 7/25/2016 0:30 8.5 Alex 7/21/2016 16:00 7/22/2016 0:30 8.5 Alex 7/20/2016 16:00 7/21/2016 0:30 8.5 Alex 7/19/2016 16:00 7/20/2016 0:30 8.5 Alex 7/18/2016 16:00 7/19/2016 0:30 8.5 Alex 7/17/2016 16:00 7/18/2016 0:30 8.5 Alex 7/14/2016 16:00 7/15/2016 0:30 8.5 Alex 7/13/2016 16:00 7/14/2016 0:30 8.5 Alex 7/12/2016 16:00 7/13/2016 0:30 8.5 Alex 7/11/2016 16:00 7/12/2016 0:30 8.5 Alex 7/10/2016 16:00 7/11/2016 0:30 8.5

Usually it is 5 working days and I get the answer if I group them by start_date. But I need an new date column where I need the output as below. Please note that 15/7/2016 and 22/7/2016 was not present in the above table. I need the additional 0.5 hour & date for the 6th day to be included to my derived table.

User Date difference Alex 7/25/2016 8.5 Alex 7/24/2016 8.5 Alex 7/22/2016 0.5 Alex 7/21/2016 8.0 Alex 7/20/2016 8.5 Alex 7/19/2016 8.5 Alex 7/18/2016 8.5 Alex 7/17/2016 8.5 Alex 7/15/2016 0.5 Alex 7/14/2016 8.0 Alex 7/13/2016 8.5 Alex 7/12/2016 8.5 Alex 7/11/2016 8.5 Alex 7/10/2016 8.5

I calculate the difference as

round(cast(datediff(seconds, start_date, end_date) as decimal)/3600,2)

Can you clarify the pattern behind your desired calculation? Why is the 14th and the 21st split up into 8.0 for the current and 0.5 for the following day? Is this always the case for Thursdays? Or rather when the following day does not have a start_date of its own? — moertel
– moertel, Commented Jul 29, 2016 at 14:53
Can you clarify what you are trying to achieve? Are you saying that you want to know the number of hours between start & end that were 'consumed' on each day, broken down by that day? For example, Monday 10pm - Wednesday 3am would be 2 hours for Monday, 24 hours for Tuesday and 3 hours for Wednesday? — John Rotenstein
– John Rotenstein, Commented Jul 31, 2016 at 8:47
So the logic behind this is, i would need to break the total number of hours per day. Ex: If start_date = 1/1/2016 15:30 and end_date = 2/1/2016 00:30 then I need the break up with group by start_date: 8+0.5 = 8.5 for 1/1/2016. this continues till the end of the week. But on the last day of the week (day Friday), the total number will reflect as 8.5 which is the correct calculation. However, what I need is to split the additional 0.5 hour (which comes in Saturday) to be published on Saturday (i.e group by end_date) — Badri Prasad Rath
– Badri Prasad Rath, Commented Aug 1, 2016 at 8:05

moertel · Accepted Answer · 2016-07-29 16:50:50Z

Whenever there is sophisticated logic, I'd suggest to use union queries and split the logic into a select query (or even table) each. Then you'd be able to calculate this in two steps. The main difference seems to be whether the 0.5 between 00:00:00 and 00:30:00 should be counted to the previous workday or whether it should stand alone. The latter seems to be determined based on whether the end_date is also a workday itself. I see three cases:

Next day is a workday:
1. Report all hours on start_date
Next day is not a workday:
1. Report hours from start_date to midnight on start_date
2. Report hours from midnight to end_date on end_date

I used the following example data based on your description:

create temporary table _test (user varchar(20), start_date timestamp, end_date timestamp); insert into _test values ('Alex', '7/25/2016 16:00', '7/26/2016 0:30'), ('Alex', '7/24/2016 16:00', '7/25/2016 0:30'), ('Alex', '7/21/2016 16:00', '7/22/2016 0:30'), ('Alex', '7/20/2016 16:00', '7/21/2016 0:30'), ('Alex', '7/19/2016 16:00', '7/20/2016 0:30'), ('Alex', '7/18/2016 16:00', '7/19/2016 0:30'), ('Alex', '7/17/2016 16:00', '7/18/2016 0:30'), ('Alex', '7/14/2016 16:00', '7/15/2016 0:30'), ('Alex', '7/13/2016 16:00', '7/14/2016 0:30'), ('Alex', '7/12/2016 16:00', '7/13/2016 0:30'), ('Alex', '7/11/2016 16:00', '7/12/2016 0:30'), ('Alex', '7/10/2016 16:00', '7/11/2016 0:30');

We will need to know whether the next day is a workday, so I suggest using the lead() window function (see documentation) which will give you the start_date from the next row.

create temporary table _differences as ( select user_name , start_date::date as start_date , end_date::date as end_date /** * Calculate difference in hours between start_date and end_date: */ , round(cast(datediff(seconds, start_date, end_date) as decimal)/3600,2) as hours_start_to_end /** * Calculate difference in hours between start_date and midnight: */ , round(cast(datediff(seconds, start_date, dateadd(day, 1, start_date::date)) as decimal)/3600,2) as hours_start_to_midnight /** * Calculate difference between midnight on end_date and end_date: */ , round(cast(datediff(seconds, end_date::date, end_date) as decimal)/3600,2) as hours_midnight_to_end /** * Calculate number of days from end_date until next start_date: */ , datediff(day, end_date::date, lead(start_date::date) over(partition by user_name order by start_date::date)) as days_until_next_workday from _test );

Then the following query:

 select user_name as user_name , start_date as ref_date , hours_start_to_end as difference from _differences where days_until_next_workday = 0 -- report all work hours on start_date union select user_name as user_name , start_date as ref_date , hours_start_to_midnight as difference from _differences where days_until_next_workday > 0 -- report partial work hours on start_date union select user_name as user_name , end_date as ref_date , hours_midnight_to_end as difference from _differences where days_until_next_workday > 0 -- report partial work hours on end_date order by user_name , ref_date desc ;

Would yield the following result:

 user_name | ref_date | difference -----------+------------+------------ Alex | 2016-07-24 | 8.50 Alex | 2016-07-22 | 0.50 Alex | 2016-07-21 | 8.00 Alex | 2016-07-20 | 8.50 Alex | 2016-07-19 | 8.50 Alex | 2016-07-18 | 8.50 Alex | 2016-07-17 | 8.50 Alex | 2016-07-15 | 0.50 Alex | 2016-07-14 | 8.00 Alex | 2016-07-13 | 8.50 Alex | 2016-07-12 | 8.50 Alex | 2016-07-11 | 8.50 Alex | 2016-07-10 | 8.50 (13 rows)

You can see that 7/25/2016 is missing because there is no start_date on or after 7/26/2016, so you'll need to figure out how to account for that special case.

Perfect, actually I liked the use of union here. I am using like below, case when trunc(start_time) <> trunc(end_time) then cast(to_char(start_time,'yyyy-mm-dd 23:59:59') as timestamp) else cast(to_char(end_time,'yyyy-mm-dd hh24:mi:ss') as timestamp) end as st_t1
union case when trunc(start_time) <> trunc(end_time) then cast(to_char(end_time,'yyyy-mm-dd 00:00:00') as timestamp) else cast(to_char(end_time,'yyyy-mm-dd hh24:mi:ss') as timestamp) end as st_t2

moertel · Accepted Answer · 2016-08-04 17:54:21Z

here is how I have done the calc and it works perfectly

select user, trunc(start_time) as date1, SUM(case when id = 1 then round(cast(datediff(seconds, start_time, st_t1) as decimal)/3600,2) end) as SCHEDULE from ( select user, start_time, case when trunc(start_time) <> trunc(end_time) then cast(to_char(start_time,'yyyy-mm-dd 23:59:59') as timestamp) else cast(to_char(end_time,'yyyy-mm-dd hh24:mi:ss') as timestamp) end as st_t1 from table1 a where id = 1 group by user_name, trunc(start_time) union select user_name, trunc(end_time) as date1, SUM(case when id = 1 then round(cast(datediff(seconds, st_t2, end_time) as decimal)/3600,2) end) as SCHEDULE from ( select user_name, end_time, case when trunc(start_time) <> trunc(end_time) then cast(to_char(end_time,'yyyy-mm-dd 00:00:00') as timestamp) else cast(to_char(end_time,'yyyy-mm-dd hh24:mi:ss') as timestamp) end as st_t2 from table1 a where id = 1 ) group by user, trunc(end_time)

Collectives™ on Stack Overflow

Redshift SQL: Get date difference based on start and end dates

2 Answers 2

2 Comments

Comments

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

Comments

Related