
I have the following schema in Spark and would like to flatten it.

root
 |-- binlog_read_timestamp: string (nullable = true)
 |-- row: struct (nullable = true)
 |    |-- after_values: struct (nullable = true)
 |    |    |-- id: long (nullable = true)
 |    |-- before_values: struct (nullable = true)
 |    |    |-- id: long (nullable = true)
 |    |-- values: struct (nullable = true)
 |    |    |-- id: long (nullable = true)
 |-- schema: string (nullable = true)
 |-- table: string (nullable = true)
 |-- type: string (nullable = true)
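For reference, fully flattening this schema amounts to selecting each nested leaf field with an alias. A minimal sketch, assuming the DataFrame is named df (the alias names are my own choice, not from the post):

from pyspark.sql.functions import col

# Pull each leaf of the nested "row" struct up to a top-level column.
# The alias names here are illustrative, not from the original post.
flat_df = df.select(
    col("binlog_read_timestamp"),
    col("row.after_values.id").alias("after_values_id"),
    col("row.before_values.id").alias("before_values_id"),
    col("row.values.id").alias("values_id"),
    col("schema"),
    col("table"),
    col("type"),
)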

So, depending on the value of type, I want to do the following:

IF type == A THEN add new column with after_values.id
IF type == B THEN add new column with before_values.id
IF type == C THEN add new column with values.id

Any suggestions on how to do it? Thanks!


1 Answer


Try

from pyspark.sql.functions import col, when

df.withColumn(
    "new_column",
    # the *_values structs sit under the top-level "row" column
    when(col("type") == "A", col("row.after_values.id"))
    .when(col("type") == "B", col("row.before_values.id"))
    .when(col("type") == "C", col("row.values.id")))
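A quick way to sanity-check the expression; the sample rows below are made up to mirror the schema in the question:

from pyspark.sql import Row, SparkSession
from pyspark.sql.functions import col, when

spark = SparkSession.builder.getOrCreate()

# Made-up rows that mirror the question's schema (ids chosen for illustration).
df = spark.createDataFrame([
    Row(type="A", row=Row(after_values=Row(id=1), before_values=Row(id=2), values=Row(id=3))),
    Row(type="B", row=Row(after_values=Row(id=1), before_values=Row(id=2), values=Row(id=3))),
    Row(type="C", row=Row(after_values=Row(id=1), before_values=Row(id=2), values=Row(id=3))),
])

result = df.withColumn(
    "new_column",
    when(col("type") == "A", col("row.after_values.id"))
    .when(col("type") == "B", col("row.before_values.id"))
    .when(col("type") == "C", col("row.values.id")))

result.select("type", "new_column").show()
# type A picks after_values.id (1), B picks before_values.id (2),
# C picks values.id (3); any other type would yield null since
# there is no otherwise() clause.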


