Linked Questions

4 votes
3 answers
4k views

We use melt and dcast to convert data between wide and long formats. Refer to http://seananderson.ca/2013/10/19/reshape.html for more details. Either Scala or SparkR is fine. I've gone through ...
asked by sag
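The question above asks for a melt-style wide-to-long reshape in Spark. As a minimal pure-Python sketch of the transformation itself (not Spark code; column and variable names are hypothetical):

```python
def melt(rows, id_vars, value_vars):
    """Wide -> long: emit one (variable, value) row per value column."""
    out = []
    for row in rows:
        for col in value_vars:
            record = {k: row[k] for k in id_vars}  # carry the id columns
            record["variable"] = col               # former column name
            record["value"] = row[col]             # former cell value
            out.append(record)
    return out

wide = [{"name": "A", "col1": 1, "col2": 6}]
long_rows = melt(wide, id_vars=["name"], value_vars=["col1", "col2"])
# long_rows == [{"name": "A", "variable": "col1", "value": 1},
#               {"name": "A", "variable": "col2", "value": 6}]
```

The same shape is what `pandas.melt` and the Spark SQL `stack` function produce; dcast is the inverse (long back to wide).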
2 votes
3 answers
4k views

I have a pyspark data frame with multiple columns as follows: name col1 col2 col3 A 1 6 7 B 2 7 6 C 3 8 5 D 4 9 ...
asked by user3642360
0 votes
1 answer
4k views

Converting structType to MapType in Spark. Schema: event: struct (nullable = true) | | event_category: string (nullable = true) | | event_name: string (nullable = true) | | properties: ...
asked by shiva.n404
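For the struct-to-map conversion above, one common route is to build a Spark SQL `map(...)` expression from the struct's field names and pass it to `selectExpr`. A hedged sketch that only generates the expression string (the helper is hypothetical; the field names come from the schema in the question):

```python
def struct_to_map_expr(struct_col, field_names):
    """Build a Spark SQL expression turning struct fields into a map column."""
    # map() takes alternating key, value arguments; cast values to a common
    # type so the map is homogeneous.
    pairs = ", ".join(
        f"'{f}', cast({struct_col}.{f} as string)" for f in field_names
    )
    return f"map({pairs}) as {struct_col}_map"

expr = struct_to_map_expr("event", ["event_category", "event_name"])
# expr == "map('event_category', cast(event.event_category as string), "
#         "'event_name', cast(event.event_name as string)) as event_map"
```

In PySpark this string would then be used as `df.selectExpr(expr)`, with the field list read from `df.schema` rather than hardcoded.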
0 votes
0 answers
3k views

I have already gone through the above answers and posted my concerns in comments, so please do not close this before answering my comments. I went through some of the answers available but none was ...
asked by GAURAV JINDAL
0 votes
0 answers
3k views

I am using Spark SQL 2.2.0 and DataFrame/DataSet API. I need to explode several columns one per row. I have: +------+------+------+------+------+ |col1 |col2 |col3 |col4 |col5 | +------+------+-...
asked by alex-arkhipov
3 votes
2 answers
2k views

I am working on Databricks using Python 2. I have a PySpark dataframe like: |Germany|USA|UAE|Turkey|Canada... |5 | 3 |3 |42 | 12.. Which, as you can see, consists of hundreds of columns and ...
asked by Madno
0 votes
1 answer
930 views

I have a data frame and I would like to use Scala to explode rows into multiple rows using the values in multiple columns. Ideally I am looking to replicate the behavior of the R function melt(). All ...
asked by Raphvanns
1 vote
1 answer
349 views

How can I use pyspark or pandas to achieve the transformation below? Thanks a lot. The source file is a CSV with the following info: expected:
asked by Jennifer Stone
93 votes
10 answers
136k views

I am starting to use Spark DataFrames and I need to be able to pivot the data to create multiple columns out of 1 column with multiple rows. There is built in functionality for that in Scalding and I ...
asked by J Calbreath
68 votes
6 answers
65k views

Is there an equivalent of Pandas Melt function in Apache Spark in PySpark or at least in Scala? I was running a sample dataset till now in Python and now I want to use Spark for the entire dataset.
asked by Venkatesh Durgumahanthi
51 votes
9 answers
95k views

I'm trying to transpose some columns of my table to row. I'm using Python and Spark 1.5.0. Here is my initial table: +-----+-----+-----+-------+ | A |col_1|col_2|col_...| +-----+-------------------...
asked by Raouf
3 votes
2 answers
7k views

I have seen a few solutions to unpivot a spark dataframe when the number of columns is reasonably low and that the columns' names can be hardcoded. Do you have a scalable solution to unpivot a ...
asked by mobupu
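For the scalable-unpivot question above, the usual trick is to generate Spark SQL's `stack()` expression from the column list instead of hardcoding names. A hedged sketch that only builds the expression string (the helper and column names are hypothetical):

```python
def unpivot_expr(value_cols):
    """Build a stack() expression unpivoting the given columns."""
    n = len(value_cols)
    # stack(n, name1, col1, name2, col2, ...) emits one row per column
    pairs = ", ".join(f"'{c}', {c}" for c in value_cols)
    return f"stack({n}, {pairs}) as (variable, value)"

cols = ["Germany", "USA", "UAE"]
expr = unpivot_expr(cols)
# expr == "stack(3, 'Germany', Germany, 'USA', USA, 'UAE', UAE) as (variable, value)"
```

In PySpark this would be applied as `df.selectExpr("name", expr)`, with `cols` taken from `df.columns` minus the id columns, so hundreds of columns need no hardcoding.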
3 votes
3 answers
7k views

I have some data like this: ID Value1 Value2 Value40 101 3 520 2001 102 29 530 2020 I want to take this data and convert in to a KV style pair instead ID ValueVv ValueDesc 101 3 Value1 101 520 Value2 ...
asked by Adam
3 votes
2 answers
2k views

I saw this question here: Transpose DataFrame Without Aggregation in Spark with scala and I wanted to do exactly the opposite. I have this Dataframe with a single row, with values that are string, ...
asked by Marcelo Li Koga
0 votes
1 answer
4k views

(Using Apache Spark version 1.6) I referred to the link below to attempt the unpivot feature: unpivot in spark-sql/pyspark. The issue here is that I'm getting a runtime exception when executing: df.select($"A", ...
asked by Shabeel
