I have this API response:
    {
        '02/09/2021': {
            'ABC': { 'emp': 'A1', 'value': '12421' },
            'DEF': { 'emp': 'D1', 'value': '3345' },
            'GHI': { 'emp': 'G2', 'value': '260048836600' },
            'JKL': { 'emp': 'J1', 'value': '66654654' }
        }
    }

and would like to normalize it into a table in this format:

    CODE | EMP | VALUE        | DATE
    ========================================
    ABC  | A1  | 12421        | 02/09/2021
    DEF  | D1  | 3345         | 02/09/2021
    GHI  | G2  | 260048836600 | 02/09/2021
    JKL  | J1  | 66654654     | 02/09/2021

I tried to use explode, but I couldn't get it to work. How can I get this result?
To reproduce it:
    import json

    api_response = {
        '02/09/2021': {
            'ABC': {'emp': 'A1', 'value': '12421'},
            'DEF': {'emp': 'D1', 'value': '3345'},
            'GHI': {'emp': 'G2', 'value': '260048836600'},
            'JKL': {'emp': 'J1', 'value': '66654654'}
        }
    }

    rdd = spark.sparkContext.parallelize([json.dumps(api_response)])
    input_df = spark.read.json(rdd)
[dataframe].limit(3).collect() and get the schema with [dataframe].schema.simpleString(), where [dataframe] is your variable name. That will make it easier to reproduce.
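One possible approach, sketched here under the assumption that the response is small enough to handle on the driver: flatten the nested dictionary into rows in plain Python first, instead of exploding it in Spark. The helper name `flatten_response` is hypothetical and not part of any library.

```python
def flatten_response(response):
    """Turn {date: {code: {'emp': ..., 'value': ...}}} into (code, emp, value, date) rows."""
    rows = []
    for date, codes in response.items():
        for code, fields in codes.items():
            rows.append((code, fields["emp"], fields["value"], date))
    return rows


api_response = {
    "02/09/2021": {
        "ABC": {"emp": "A1", "value": "12421"},
        "DEF": {"emp": "D1", "value": "3345"},
        "GHI": {"emp": "G2", "value": "260048836600"},
        "JKL": {"emp": "J1", "value": "66654654"},
    }
}

rows = flatten_response(api_response)
for row in rows:
    print(row)
```

With an active SparkSession named `spark`, passing these rows to `spark.createDataFrame(rows, ["CODE", "EMP", "VALUE", "DATE"])` should give a DataFrame in the requested layout. The values stay strings, as in the response; cast the VALUE column if a numeric type is needed.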