I have a dataset like below
+-------+--------+----------+----+----------+ |cytid |itemcode|percentile|mean|DTXN | +---------------+-----------+---+----------+ |800 | 7 | x4| y# |2/1/2017 | |700 | 13 | y#| oz |3/1/2017 | |800 | 24 | oz| oz |8/1/2017 | |900 | 747 | oz| IH |27/9/2017 | |700 | 798 | IH| yc |19/10/2017| |800 | 841 | yc| g5 |8/11/2017 | |800 | 867 | g5| IH |19/11/2017| |900 | 911 | IH| x4 |2/12/2017 | |800 | 945 | x4| ko |13/12/2017| |800 | 984 | ko| cd |27/12/2017| +-------+-------+----------+----+----------+ I need to produce an output like below
+-------+--------------------------------------+ |cytid | ipm | +----------------------------------------------+ |800 | x4,y#,oz,oz,yc,g5,g5,IH,x4,ko,ko,cd| |700 | y#,oz,IH,yc | |900 | oz,IH,IH,x4 | +-------+--------------------------------------+ To be precise i need to merge the columns itemcode, percentile, mean. So how to do this in scala