reshape dataframe from column to rows in scala

Question

I want to reshape a dataframe in Spark using Scala. Most of the examples I found use groupBy and pivot. In my case I don't want to use groupBy. This is what my dataframe looks like:

      tagid           timestamp value
    1     1 2016-12-01 05:30:00     5
    2     1 2017-12-01 05:31:00     6
    3     1 2017-11-01 05:32:00     4
    4     1 2017-11-01 05:33:00     5
    5     2 2016-12-01 05:30:00   100
    6     2 2017-12-01 05:31:00   111
    7     2 2017-11-01 05:32:00   109
    8     2 2016-12-01 05:34:00    95

And I want my dataframe to look like this:

                timestamp  1   2
    1 2016-12-01 05:30:00  5 100
    2 2017-12-01 05:31:00  6 111
    3 2017-11-01 05:32:00  4 109
    4 2017-11-01 05:33:00  5  NA
    5 2016-12-01 05:34:00 NA  95

I used pivot without groupBy and it throws an error.

df.pivot("tagid")

error: value pivot is not a member of org.apache.spark.sql.DataFrame.

How do I convert this? Thank you.

pivot requires a RelationalGroupedDataset and hence groupBy. That is to say, you have no choice but to call groupBy before pivot.
– Jacek Laskowski, Commented Dec 1, 2017 at 9:26
Thank you. Are there any other approaches to reshape the dataframe, other than the pivot method? Actually groupBy and pivot worked fine for me, but I would still like to know whether any other approaches are possible.
– Jonreyan, Commented Dec 2, 2017 at 7:03
I don't think so. The problem looks like a good match for pivot (and so groupBy).
– Jacek Laskowski, Commented Dec 2, 2017 at 12:49

Anahcolus · Accepted Answer · 2017-12-01 07:17:08Z

Doing the following should solve your issue.

df.groupBy("timestamp").pivot("tagid").agg(first($"value"))

You should get the following final dataframe:

    +-------------------+----+----+
    |timestamp          |1   |2   |
    +-------------------+----+----+
    |2017-11-01 05:33:00|5   |null|
    |2017-11-01 05:32:00|4   |109 |
    |2017-12-01 05:31:00|6   |111 |
    |2016-12-01 05:30:00|5   |100 |
    |2016-12-01 05:34:00|null|95  |
    +-------------------+----+----+

For more information, you can check out the Databricks blog.
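
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It makes several assumptions that are not in the question: a local SparkSession, timestamps kept as plain strings, and the illustrative names PivotExample, sample and toWide. It also passes the pivot values (1 and 2) explicitly, which is optional; leaving them out makes Spark compute the distinct tagid values itself. The orderBy is only there to make the row order predictable.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, first}

// Hypothetical wrapper object, only for illustration.
object PivotExample {

  // Rebuilds the sample data from the question (timestamps kept as strings, an assumption).
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._ // needed for toDF on a local Seq
    Seq(
      (1, "2016-12-01 05:30:00", 5),
      (1, "2017-12-01 05:31:00", 6),
      (1, "2017-11-01 05:32:00", 4),
      (1, "2017-11-01 05:33:00", 5),
      (2, "2016-12-01 05:30:00", 100),
      (2, "2017-12-01 05:31:00", 111),
      (2, "2017-11-01 05:32:00", 109),
      (2, "2016-12-01 05:34:00", 95)
    ).toDF("tagid", "timestamp", "value")
  }

  // groupBy returns a RelationalGroupedDataset, which is where pivot is defined;
  // agg turns the result back into a DataFrame with one column per tagid.
  def toWide(df: DataFrame): DataFrame =
    df.groupBy("timestamp")
      .pivot("tagid", Seq(1, 2))   // explicit pivot values (assumed known up front)
      .agg(first(col("value")))    // one value per (timestamp, tagid) in this data
      .orderBy("timestamp")        // deterministic row order for display

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")          // assumption: run locally
      .appName("pivot-example")
      .getOrCreate()

    toWide(sample(spark)).show(false)
    spark.stop()
  }
}
```

Timestamps that have no row for a given tagid come out as null in that column, which corresponds to the NA cells in the desired output.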

Interesting since the solution is against... "In my case i dont want to use groupBy." :)
