I am learning to process what i consider a complex json structure and am trying to load this into a dataframe. I want a single record for each outcome id. Here is the sample json structure
{ "_id": 12345, "reports": [ { "body": "\n***\nGeneral report text.", "outcome": { "comments": [], "id": "1", "status": { "failed": { "both": 0, "human": 0, "auto": 0, "total": 0 }, "open": { "both": 0, "human": 0, "auto": 0, "total": 0 }, "passed": { "both": 0, "human": 0, "auto": 1, "total": 1 }, "code": { "_input": 0, "_output": 0, "canceled": 0 }, "total": 1 } }, "type": "outcome" }, { "body": "\n***\nGeneral report text.", "outcome": { "comments": [], "id": "2", "status": { "failed": { "both": 0, "human": 0, "auto": 0, "total": 0 }, "open": { "both": 0, "human": 0, "auto": 0, "total": 0 }, "passed": { "both": 0, "human": 0, "auto": 1, "total": 1 }, "code": { "_input": 0, "_output": 0, "canceled": 0 }, "total": 1 } }, "type": "outcome" } ] } The desired format in the dataframe is
report_id | outcome.id | body | outcome.comments | status.failed.both | status.failed.human | status.failed.auto so on and so forth I would like all of the statuses in a single record (by outcome.id) and not sure how to normalize it all into a single record.
I've tried this but it does not give me the desired dataframe structure
df_reports = pd.json_normalize(data,record_path=['reports', 'outcome'], meta=[ ['reports','body'], ['outcome','comment'], ['outcome','id'], ['outcome','status'] ]
idvalues, yet the JSON you show only has a single reportidvalue. Is your parsed JSON actually a list of dictionaries?df_reportsafterjson_normalize. You could also show dataframe with expected result. Maybe it will need to reformat it after usingjson_normalize"\n..."in your string rendering it invalid JSON. I have to assume that what you are showing is not JSON but rather JSON that has already been parsed into a dictionary. Then it is no longer a "json structure".