2

I have this DF for example

+----+------+------+-------+ | id | val1 | val2 | val3 | +----+------+------+-------+ | 1 | a | b | c | | 2 | d | e | f | | 3 | g | h | i | 

and I want the Columns val1, val2 and val3 transposed into Rows in this way:

+----+------+------+ | id | key | value| +----+------+------+ | 1 | val1 | a | | 1 | val2 | b | | 1 | val3 | c | | 2 | val1 | d | | 2 | val2 | e | | 2 | val3 | f | | 3 | val1 | g | | 3 | val2 | h | | 3 | val3 | i | 

How can I achieve that?

1 Answer 1

3

By exploding a Map structure it is possible to get such transformation. So first transform the 3 columns into a Map (based on https://stackoverflow.com/a/41291156/4725074) for then exploding.

So given the following Spark DF:

import pyspark.sql.functions as F from itertools import chain spark = SparkSession \ .builder \ .getOrCreate() d = [ {'id': 1, 'val1': 'a', 'val2': 'b', 'val3': 'c'}, {'id': 2, 'val1': 'd', 'val2': 'e', 'val3': 'f'}, {'id': 3, 'val1': 'g', 'val2': 'h', 'val3': 'i'}, ] df = spark.createDataFrame(d) 

the solution would be:

m = F.create_map(list(chain(*( (F.lit(c), F.col(c)) for c in df.columns if c in ["val1", "val2", "val3"])))) df.withColumn('map', m)\ .select('*', F.explode('map')).drop('map') 

It's important to use a select() and not a witchColumn() as when exploding a Map column will generate 2 new columns, key and value.

The result will be the transposition of the selected columns into rows as reported in the example.

Sign up to request clarification or add additional context in comments.

2 Comments

I am trying to use above logic for my dataframe which has about 300 columns and I am passing a list I want here ["val1", "val2", "val3"], but above code doesnot give transpose my columns to rows, it ias as is
it might be that 300 columns is a big number and this is an expensive operations... try with just a small sample and check if that works

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.