How can I transpose a Spark DataFrame like this?
From:
| Key | Value |
|---|---|
| Key1 | Value1 |
| Key2 | Value2 |
| Key3 | Value3 |
To:
| Key1 | Key2 | Key3 |
|---|---|---|
| Value1 | Value2 | Value3 |
Thanks!
You can apply a pivot operation to transpose the rows into columns.
    from pyspark.sql import functions as F

    data = [("Key1", "Value1"), ("Key2", "Value2"), ("Key3", "Value3")]
    df = spark.createDataFrame(data, ("Key", "Value"))

    df.groupBy().pivot("Key").agg(F.first("Value")).show()
    """
    +------+------+------+
    |  Key1|  Key2|  Key3|
    +------+------+------+
    |Value1|Value2|Value3|
    +------+------+------+
    """

The same approach with lowercase column names:

    import pyspark.sql.functions as F

    df = spark.createDataFrame(
        [('key1', 'value1'), ('key2', 'value2'), ('key3', 'value3')],
        ['key', 'value'],
    )
    df.groupBy().pivot('key').agg(F.first('value')).show()

or

    df.groupBy().pivot('key').agg({"value": "first"}).show()

    +------+------+------+
    |  key1|  key2|  key3|
    +------+------+------+
    |value1|value2|value3|
    +------+------+------+
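If the key/value DataFrame is small enough to fit on the driver, one alternative to `pivot` is to collect the rows and build the single-row DataFrame directly. The following is only a sketch under that assumption: `transpose_pairs` and `transpose_small_df` are hypothetical helper names, not part of PySpark, and the `Key`/`Value` column names are taken from the question. Unlike `pivot`, which sorts the distinct keys, this keeps the keys in the order the rows are returned by `collect()`, and that order is not guaranteed in general.

```python
def transpose_pairs(pairs):
    """Turn [(key, value), ...] into (column_names, single_row).

    If a key appears more than once, only the first value seen is kept.
    """
    columns = []
    values = []
    seen = set()
    for key, value in pairs:
        if key in seen:
            continue  # duplicate key: keep the first value only
        seen.add(key)
        columns.append(key)
        values.append(value)
    return columns, tuple(values)


def transpose_small_df(spark, df, key_col="Key", value_col="Value"):
    """Hypothetical helper: assumes `df` is small enough to collect.

    `spark` is an existing SparkSession and `df` a two-column DataFrame.
    """
    # Bring all rows to the driver as plain (key, value) tuples.
    pairs = [(row[key_col], row[value_col]) for row in df.collect()]
    columns, single_row = transpose_pairs(pairs)
    # One row whose column names are the original keys.
    return spark.createDataFrame([single_row], columns)
```

With the `df` from the question, `transpose_small_df(spark, df).show()` should print a single row with `Key1`, `Key2`, `Key3` as columns. For anything larger than a handful of rows, the `pivot` approach above is the better fit because it stays distributed.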