Splitting a timestamp range into two rows in PostgreSQL

Question

I have two timestamp columns in a table:

     usage_from      |     usage_till
---------------------+---------------------
 2013-10-09 23:08:17 | 2013-10-09 23:16:00
 2013-10-09 23:08:17 | 2013-10-09 23:08:19
 2013-10-09 23:08:17 | 2013-10-10 18:58:22
 2013-10-09 23:08:17 | 2013-10-09 23:15:05
 2013-10-09 23:08:17 | 2013-10-09 23:09:00
 2013-10-09 23:08:17 | 2013-10-09 23:08:20
 2013-10-09 23:08:17 | 2013-10-09 23:32:04
 2013-10-09 23:08:17 | 2013-10-10 02:02:03
 2013-10-09 23:08:17 | 2013-10-10 07:31:00
 2013-10-09 23:08:17 | 2013-10-10 22:41:04

I need to split these rows as shown below:

     usage_from      |     usage_till
---------------------+---------------------
 2013-10-09 23:08:17 | 2013-10-09 23:16:00
 2013-10-09 23:08:17 | 2013-10-09 23:08:19
 2013-10-09 23:08:17 | 2013-10-10 02:00:00
 2013-10-10 02:00:00 | 2013-10-10 18:58:22  -- split
 2013-10-09 23:08:17 | 2013-10-09 23:15:05
 2013-10-09 23:08:17 | 2013-10-09 23:09:00
 2013-10-09 23:08:17 | 2013-10-09 23:08:20
 2013-10-09 23:08:17 | 2013-10-09 23:32:04
 2013-10-09 23:08:17 | 2013-10-10 02:00:00
 2013-10-10 02:00:00 | 2013-10-10 02:02:03  -- split
 2013-10-09 23:08:17 | 2013-10-10 02:00:00
 2013-10-10 02:00:00 | 2013-10-10 07:31:00  -- split
 2013-10-09 23:08:17 | 2013-10-10 02:00:00
 2013-10-10 02:00:00 | 2013-10-10 22:41:04  -- split

In the example above, I split each time range at 02:00:00.

After many attempts I managed to compute the split as two extra columns, as shown below, but I could not turn it into separate rows.

     usage_from      |     usage_till      |     end_time_1      |     end_time_2
---------------------+---------------------+---------------------+---------------------
 2013-10-09 23:08:17 | 2013-10-09 23:16:00 | 2013-10-09 23:16:00 | 2013-10-11 02:00:00
 2013-10-09 23:08:17 | 2013-10-09 23:08:19 | 2013-10-09 23:08:19 | 2013-10-11 02:00:00
 2013-10-09 23:08:17 | 2013-10-10 18:58:22 | 2013-10-10 02:00:00 | 2013-10-10 18:58:22
 2013-10-09 23:08:17 | 2013-10-09 23:15:05 | 2013-10-09 23:15:05 | 2013-10-11 02:00:00
 2013-10-09 23:08:17 | 2013-10-09 23:09:00 | 2013-10-09 23:09:00 | 2013-10-11 02:00:00
 2013-10-09 23:08:17 | 2013-10-09 23:08:20 | 2013-10-09 23:08:20 | 2013-10-11 02:00:00
 2013-10-09 23:08:17 | 2013-10-09 23:32:04 | 2013-10-09 23:32:04 | 2013-10-11 02:00:00
 2013-10-09 23:08:17 | 2013-10-10 02:02:03 | 2013-10-10 02:00:00 | 2013-10-10 02:02:03
 2013-10-09 23:08:17 | 2013-10-10 07:31:00 | 2013-10-10 02:00:00 | 2013-10-10 07:31:00
 2013-10-09 23:08:17 | 2013-10-10 22:41:04 | 2013-10-10 02:00:00 | 2013-10-10 22:41:04

Any idea how to do this? I have been struggling for the last few days.
I'm using Redshift 1.0.757 (based on PostgreSQL 8.0.2).

"Im using PostgreSQL 8.0". Then why is this tagged MySQL? And really, 8.0? That's ancient. Upgrade urgently, assuming you're really using Pg 8.0, and not some fork/variant of PostgreSQL. Show SELECT version() output.
– Craig Ringer, Commented Mar 25, 2014 at 9:46
Version: PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.757
– DJo, Commented Mar 25, 2014 at 9:50
Redshift isn't exactly Postgres. Much less MySQL, of course.
– Erwin Brandstetter, Commented Mar 25, 2014 at 13:16

Erwin Brandstetter · Accepted Answer · 2014-08-27 22:13:22Z

If Redshift supported the basic form of generate_series(), this might work.¹ At least it works in Postgres 8.3:

SELECT CASE WHEN split > 0 AND g = 0 THEN usage_from
            -- 2 a.m. boundary; shifting by 2 hours first also covers ranges ending between midnight and 2 a.m.
            WHEN split > 0 AND g = 1 THEN (usage_till - '2:0'::time)::date + '2:0'::time
            ELSE usage_from END AS usage_from
     , CASE WHEN split > 0 AND g = 0 THEN (usage_till - '2:0'::time)::date + '2:0'::time
            WHEN split > 0 AND g = 1 THEN usage_till
            ELSE usage_till END AS usage_till
FROM  (
   SELECT *
        , generate_series(0, split) AS g  -- one row for each value 0 .. split
   FROM  (
      SELECT *
           , (usage_till - '2:0'::time)::date
           - (usage_from - '2:0'::time)::date AS split  -- results in integer
      FROM   t
      ) sub1
   ) sub2

How?

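In the inner subquery sub1 I find whether the time range crosses 2 a.m. and save that in the column split: both timestamps are shifted back by two hours, so that 2 a.m. becomes midnight, and the difference of the resulting dates is the number of crossings. I am assuming the time range never crosses 2 a.m. twice, but the query could easily be adapted to that, since generate_series(0, split) yields one extra row per crossing.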
In the next subquery sub2 generate_series() generates two rows where a split is needed.
In the outer SELECT a CASE statement adjusts the timestamps accordingly.
Normally I would use interval '2 hours' instead of '2:0'::time, but I seem to remember Redshift doesn't support the interval type.
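To make the split column concrete, here is a minimal sketch that evaluates the same expression on two rows taken from the question's sample data. The inline subquery t is only a stand-in for the real table, and the snippet is meant for plain Postgres, not verified on Redshift.

```sql
-- Illustrative only: two literal rows copied from the question's sample data.
SELECT usage_from
     , usage_till
     , (usage_till - '2:0'::time)::date
     - (usage_from - '2:0'::time)::date AS split  -- date - date yields an integer
FROM  (
   SELECT '2013-10-09 23:08:17'::timestamp AS usage_from
        , '2013-10-09 23:16:00'::timestamp AS usage_till
   UNION ALL
   SELECT '2013-10-09 23:08:17'::timestamp
        , '2013-10-10 18:58:22'::timestamp
   ) t;
```

The first range stays within one 2 a.m. to 2 a.m. day, so split is 0. The second range crosses 2 a.m. once, so split is 1 and generate_series(0, split) emits two rows for it.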

SQL Fiddle for Postgres 8.3.

Not in Redshift?

If Redshift only allows generate_series() in the FROM list and not in the SELECT list, you are out of luck. This already is the ancient form. In modern Postgres, you would use a LATERAL JOIN. You could try your luck with regexp_split_to_table(), but that's not in Postgres 8.0 either.
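For comparison, a sketch of the LATERAL form mentioned above. It needs Postgres 9.3 or later, so it does not help on Postgres 8.x. It assumes the same table t and at most one 2 a.m. crossing per row. The column boundary and the alias gs are names chosen here for illustration. The boundary is derived from the shifted usage_till so that ranges ending between midnight and 2 a.m. are handled as well.

```sql
-- Sketch for Postgres 9.3+; assumes table t(usage_from, usage_till) and split in (0, 1).
SELECT CASE WHEN gs.g = 1         THEN sub.boundary ELSE sub.usage_from END AS usage_from
     , CASE WHEN gs.g < sub.split THEN sub.boundary ELSE sub.usage_till END AS usage_till
FROM  (
   SELECT usage_from
        , usage_till
        , (usage_till - '2:0'::time)::date
        - (usage_from - '2:0'::time)::date AS split              -- number of 2 a.m. crossings
        , (usage_till - '2:0'::time)::date + '2:0'::time AS boundary  -- last 2 a.m. at or before usage_till
   FROM   t
   ) sub
CROSS  JOIN LATERAL generate_series(0, sub.split) AS gs(g);
```

Rows with split = 0 pass through unchanged. Rows with split = 1 come out twice: once ending at the boundary (g = 0) and once starting at it (g = 1).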

¹ But the manual says generate_series() is unsupported.

Barring that, I can only think of a procedural solution with PL/pgSQL. But Redshift might be limited there, too ...
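One more option, not part of the original answer and untested on Redshift: replace generate_series() with a two-row derived table built with UNION ALL and join it to the base query. This is a sketch under the same assumptions (table t, at most one 2 a.m. crossing per row). The names boundary and n are chosen here for illustration, and the '2:0'::time arithmetic is carried over from the query above, so it may need adjusting if Redshift rejects it.

```sql
-- Sketch without generate_series(); uses only subqueries, UNION ALL and a join.
SELECT CASE WHEN n.g = 1         THEN sub.boundary ELSE sub.usage_from END AS usage_from
     , CASE WHEN n.g < sub.split THEN sub.boundary ELSE sub.usage_till END AS usage_till
FROM  (
   SELECT usage_from
        , usage_till
        , (usage_till - '2:0'::time)::date
        - (usage_from - '2:0'::time)::date AS split              -- number of 2 a.m. crossings
        , (usage_till - '2:0'::time)::date + '2:0'::time AS boundary  -- last 2 a.m. at or before usage_till
   FROM   t
   ) sub
JOIN  (SELECT 0 AS g UNION ALL SELECT 1) n ON n.g <= sub.split;  -- stand-in for generate_series(0, split)
```

The join condition keeps only g = 0 for rows that need no split and both g = 0 and g = 1 for rows that cross 2 a.m., which mirrors what generate_series(0, split) does in the select list.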

I appreciate your effort and the explanation. Your SQL Fiddle example is perfect, but generate_series(integer, integer) is not supported in PostgreSQL 8.0.2, so in short my issue remains open.
Is there any way to apply the above solution in PostgreSQL 8.0.2?
However, I am able to execute SELECT * FROM generate_series(2, 4) and SELECT * FROM generate_series(5,1,-1), and both return results.
@dhanishjose: Added some more clues. Otherwise I am fresh out of ideas.
Is there any chance I could get the actual function code, so that I can create the function explicitly and then use it in my code?
