
I have the following schema in Spark and would like to flatten it.

root
 |-- binlog_read_timestamp: string (nullable = true)
 |-- row: struct (nullable = true)
 |    |-- after_values: struct (nullable = true)
 |    |    |-- id: long (nullable = true)
 |    |-- before_values: struct (nullable = true)
 |    |    |-- id: long (nullable = true)
 |    |-- values: struct (nullable = true)
 |    |    |-- id: long (nullable = true)
 |-- schema: string (nullable = true)
 |-- table: string (nullable = true)
 |-- type: string (nullable = true)
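For reference, fully flattening this schema amounts to selecting each nested leaf field with an alias. A minimal sketch, assuming the DataFrame is named df (the alias names are my own choice, not from the post):

from pyspark.sql.functions import col

# Pull each leaf of the nested "row" struct up to a top-level column.
# The alias names here are illustrative, not from the original post.
flat_df = df.select(
    col("binlog_read_timestamp"),
    col("row.after_values.id").alias("after_values_id"),
    col("row.before_values.id").alias("before_values_id"),
    col("row.values.id").alias("values_id"),
    col("schema"),
    col("table"),
    col("type"),
)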

So, depending on the value of type, I want to do the following:

IF type == A THEN add new column with after_values.id
IF type == B THEN add new column with before_values.id
IF type == C THEN add new column with values.id

Any suggestions on how to do it? Thanks!


1 Answer


Try

from pyspark.sql.functions import col, when

df.withColumn(
    "new_column",
    # the *_values structs sit under the top-level "row" column
    when(col("type") == "A", col("row.after_values.id"))
    .when(col("type") == "B", col("row.before_values.id"))
    .when(col("type") == "C", col("row.values.id")))
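A quick way to sanity-check the expression; the sample rows below are made up to mirror the schema in the question:

from pyspark.sql import Row, SparkSession
from pyspark.sql.functions import col, when

spark = SparkSession.builder.getOrCreate()

# Made-up rows that mirror the question's schema (ids chosen for illustration).
df = spark.createDataFrame([
    Row(type="A", row=Row(after_values=Row(id=1), before_values=Row(id=2), values=Row(id=3))),
    Row(type="B", row=Row(after_values=Row(id=1), before_values=Row(id=2), values=Row(id=3))),
    Row(type="C", row=Row(after_values=Row(id=1), before_values=Row(id=2), values=Row(id=3))),
])

result = df.withColumn(
    "new_column",
    when(col("type") == "A", col("row.after_values.id"))
    .when(col("type") == "B", col("row.before_values.id"))
    .when(col("type") == "C", col("row.values.id")))

result.select("type", "new_column").show()
# type A picks after_values.id (1), B picks before_values.id (2),
# C picks values.id (3); any other type would yield null since
# there is no otherwise() clause.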


