You will have to find some mechanism to turn the properties struct into key/value pairs. I used a UDF that zips the column names with the values and returns an array of (key, value) tuples.
```scala
import org.apache.spark.sql.functions._

def collectUdf = udf((cols: collection.mutable.WrappedArray[String],
                      values: collection.mutable.WrappedArray[String]) => cols.zip(values))
```
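For context, here is a minimal pure-Scala sketch of what the UDF body does, using hypothetical sample keys and values (no Spark session needed):

```scala
// Hypothetical sample data mirroring the event.properties column names and values
val cols = Seq("prop1", "prop2")
val values = Seq("prop1Value", "prop2Value")

// zip pairs each column name with its value, exactly like the UDF body
val zipped = cols.zip(values)
// zipped: Seq(("prop1", "prop1Value"), ("prop2", "prop2Value"))
```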
Multiple generators per select are not supported in Spark, so you will have to save the intermediate result in a temporary DataFrame.
```scala
val columnsMap = df_json.select($"event.properties.*").columns

val temp = df_json.withColumn(
  "event_properties",
  explode(collectUdf(lit(columnsMap), array($"event.properties.*")))
)
```
The last step is simply to split the event_properties column into separate key and value columns:
```scala
temp.select(
  $"event.event_category",
  $"event.event_name",
  $"event_properties._1".as("properties_key"),
  $"event_properties._2".as("properties_value")
).show(false)
```
You should have what you desire:
```
+--------------+----------+--------------+----------------+
|event_category|event_name|properties_key|properties_value|
+--------------+----------+--------------+----------------+
|abc           |click     |prop1         |prop1Value      |
|abc           |click     |prop2         |prop2Value      |
+--------------+----------+--------------+----------------+
```
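If it helps to see the whole transformation without a Spark session, the explode step is essentially a flatMap over the zipped key/value pairs. A pure-Scala sketch with hypothetical data (the `Event` case class and sample values are made up for illustration):

```scala
// Hypothetical representation of one input row
case class Event(category: String, name: String, props: Map[String, String])

val events = Seq(
  Event("abc", "click", Map("prop1" -> "prop1Value", "prop2" -> "prop2Value"))
)

// Each (key, value) pair becomes its own output row, as explode does
val rows = for {
  e      <- events
  (k, v) <- e.props
} yield (e.category, e.name, k, v)
```

Each input event fans out into one row per property, matching the table shown above.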