I have a column 'true_recoms' in a Spark DataFrame:

    -RECORD 17-----------------------------------------------------------------
     item        | 20380109
     true_recoms | {"5556867":1,"5801144":5,"7397596":21}

I need to 'explode' this column to get something like this:

    item       | 20380109
    recom_item | 5556867
    recom_cnt  | 1
    ..............
    item       | 20380109
    recom_item | 5801144
    recom_cnt  | 5
    ..............
    item       | 20380109
    recom_item | 7397596
    recom_cnt  | 21

I've tried to use from_json, but it doesn't work:

    schema_json = StructType(fields=[
        StructField("item", StringType()),
        StructField("recoms", StringType())
    ])
    df.select(col("true_recoms"), from_json(col("true_recoms"), schema_json)).show(5)

    +--------+--------------------+------+
    |    item|         true_recoms|true_r|
    +--------+--------------------+------+
    |31746548|{"32731749":3,"31...|   [,]|
    |17359322|{"17359392":1,"17...|   [,]|
    |31480894|{"31480598":1,"31...|   [,]|
    | 7265665|{"7265891":1,"503...|   [,]|
    |31350949|{"32218698":1,"31...|   [,]|
    +--------+--------------------+------+
    only showing top 5 rows