
I have the below dataframe in Spark Scala:

a  | b  | c  | d
---|----|----|---
1  | 2  | 3  | 4
5  | 6  | 7  | 8
9  | 10 | 11 | 12
13 | 14 | 15 | 16

My code turns every row into a map; the code I tried is:

df.select(map(df.columns.flatMap(c => Seq(lit(c), col(c))): _*).as("map"))

This produces a Map(String -> String) column with only 4 records:

Map(a -> 1, b -> 2, c -> 3, d -> 4)
Map(a -> 5, b -> 6, c -> 7, d -> 8)
Map(a -> 9, b -> 10, c -> 11, d -> 12)
Map(a -> 13, b -> 14, c -> 15, d -> 16)

But I wanted to change like below:

a -> 1
b -> 2
c -> 3
d -> 4
a -> 5
b -> 6
c -> 7
d -> 8
a -> 9
b -> 10
c -> 11
d -> 12
a -> 13
b -> 14
c -> 15
d -> 16

Any suggestion on what code to change or add to get the desired result? I think it needs some transpose logic; I am fairly new to Scala.
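For intuition, the reshaping being asked for can be sketched with plain Scala collections (the `cols` and `rows` values below are hypothetical stand-ins for the DataFrame's column names and rows): each row is flattened into (column, value) pairs, yielding 16 key/value records.

```scala
// Hypothetical stand-ins for the DataFrame's columns and rows
val cols = Seq("a", "b", "c", "d")
val rows = Seq(
  Seq(1, 2, 3, 4),
  Seq(5, 6, 7, 8),
  Seq(9, 10, 11, 12),
  Seq(13, 14, 15, 16)
)

// Flatten every row into (column, value) pairs -- the same shape
// that exploding a map column produces in Spark
val pairs = rows.flatMap(row => cols.zip(row))

pairs.foreach { case (k, v) => println(s"$k -> $v") }
```

This prints `a -> 1`, `b -> 2`, and so on through `d -> 16`, which matches the desired output above.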

  • A map cannot store duplicate columns; how do you want to handle this? Commented Nov 16, 2020 at 8:01
  • We can create 2 columns out of it, k and v, from the map, remove duplicates on a row basis, and store the result in a file. Commented Nov 16, 2020 at 9:16
  • Do you want two columns k, v? Commented Nov 16, 2020 at 9:29
  • Yes, k can repeat, but the combination of k, v should be unique. Commented Nov 16, 2020 at 9:32

1 Answer


Use explode to flatten the map column into key/value rows. Try the code below.

df.select(map(df.columns.flatMap(c => Seq(lit(c), col(c))): _*).as("map"))
  .select(explode($"map"))
  .show(false)

Without map, use an array of structs:

val colExpr = array(
  df.columns.flatMap(c => Seq(struct(lit(c).as("key"), col(c).as("value")).as("map"))): _*
).as("map")

df.select(colExpr)
  .select(explode($"map").as("map"))
  .select($"map.*")
  .show(false)

1 Comment

Great, is there a way I can retain duplicate records as well? I mean, instead of a map can we implement a list as well?
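On retaining duplicates: a Scala Map collapses repeated keys, while a sequence of pairs (which is the per-row shape the array-of-structs version builds) keeps every entry. A small plain-Scala illustration of the difference:

```scala
// A Map silently drops repeated keys: only the last value for "a" survives
val asMap = Map("a" -> 1, "a" -> 5, "b" -> 2)
println(asMap.size)   // 2

// A Seq of pairs (like an array of structs in Spark) keeps all records
val asPairs = Seq("a" -> 1, "a" -> 5, "b" -> 2)
println(asPairs.size) // 3
```

This is why the array-of-structs variant in the answer is the one to use when duplicate key/value records must survive.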
