I have a column 'true_recoms' in a Spark dataframe:

    -RECORD 17-----------------------------------------------------------------
     item        | 20380109
     true_recoms | {"5556867":1,"5801144":5,"7397596":21}

I need to 'explode' this column to get something like this:

    item       | 20380109
    recom_item | 5556867
    recom_cnt  | 1
    ..............
    item       | 20380109
    recom_item | 5801144
    recom_cnt  | 5
    ..............
    item       | 20380109
    recom_item | 7397596
    recom_cnt  | 21

I've tried to use from_json, but it doesn't work:

    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StructField, StringType

    schema_json = StructType(fields=[
        StructField("item", StringType()),
        StructField("recoms", StringType())
    ])
    df.select(col("true_recoms"), from_json(col("true_recoms"), schema_json)).show(5)

    +--------+--------------------+------+
    |    item|         true_recoms|true_r|
    +--------+--------------------+------+
    |31746548|{"32731749":3,"31...|   [,]|
    |17359322|{"17359392":1,"17...|   [,]|
    |31480894|{"31480598":1,"31...|   [,]|
    | 7265665|{"7265891":1,"503...|   [,]|
    |31350949|{"32218698":1,"31...|   [,]|
    +--------+--------------------+------+
    only showing top 5 rows

1 Answer

The schema is incorrectly defined. You declare it as a struct with two string fields

  • item
  • recoms

while neither field is present in the document.
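As a rough plain-Python analogy (not Spark itself), the snippet below looks up the two declared field names in the sample document from the question. The variable names `doc`, `schema_fields`, and `extracted` are made up for illustration. Neither key exists, so both lookups come back empty, which corresponds to the `[,]` struct in the question's output.

```python
import json

# Sample value of true_recoms taken from the question
doc = json.loads('{"5556867":1,"5801144":5,"7397596":21}')

# The declared struct schema asks for these two field names
schema_fields = ["item", "recoms"]

# Neither name is a key of the document, so each lookup yields None
extracted = [doc.get(name) for name in schema_fields]
print(extracted)  # [None, None]
```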

Unfortunately, from_json can return only structs or arrays of structs, so redefining the schema as

MapType(StringType(), LongType()) 

is not an option.

Personally, I would use a UDF:

    import json

    from pyspark.sql.functions import udf, explode

    @udf("map<string, bigint>")
    def parse(s):
        try:
            return json.loads(s)
        except (TypeError, json.JSONDecodeError):
            # Null or malformed input yields a null map
            return None

which can be applied like this:

    df = spark.createDataFrame(
        [(31746548, """{"5556867":1,"5801144":5,"7397596":21}""")],
        ("item", "true_recoms")
    )

    df.select("item", explode(parse("true_recoms")).alias("recom_item", "recom_cnt")).show()

    # +--------+----------+---------+
    # |    item|recom_item|recom_cnt|
    # +--------+----------+---------+
    # |31746548|   5801144|        5|
    # |31746548|   7397596|       21|
    # |31746548|   5556867|        1|
    # +--------+----------+---------+
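For readers who want to check the per-row logic without a Spark session, here is a minimal plain-Python sketch of what the parse-then-explode step does to a single row. The helper names `parse_recoms` and `explode_row` are hypothetical and are not part of PySpark. The sketch assumes each `true_recoms` value is a flat JSON object mapping item IDs to counts.

```python
import json


def parse_recoms(s):
    """Plain-Python stand-in for the UDF body: JSON text -> dict, or None."""
    try:
        parsed = json.loads(s)
    except (TypeError, json.JSONDecodeError):
        return None
    # Anything other than a JSON object is treated as unparseable here
    return parsed if isinstance(parsed, dict) else None


def explode_row(item, true_recoms):
    """Yield one (item, recom_item, recom_cnt) tuple per map entry."""
    recoms = parse_recoms(true_recoms)
    if recoms is None:
        return  # like explode(), a null map contributes no rows
    for recom_item, recom_cnt in recoms.items():
        yield (item, recom_item, recom_cnt)


rows = list(explode_row(20380109, '{"5556867":1,"5801144":5,"7397596":21}'))
for row in rows:
    print(row)
```

Each input row fans out into one output row per key, and rows with a null or malformed `true_recoms` simply disappear. If those rows need to be kept, `explode_outer` is the Spark function to look at instead of `explode`.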