I have the JSON string below loaded into a DataFrame. Now I want to filter the records by ossId.

The condition I wrote raises an error. What is the correct way to filter by ossId?

    import pandas as pd

    data = """
    {
      "components": [
        {
          "ossId": 3946,
          "project": "OALX",
          "licenses": [
            { "name": "BSD 3", "status": "APPROVED" }
          ]
        },
        {
          "ossId": 3946,
          "project": "OALX",
          "version": "OALX.client.ALL",
          "licenses": [
            { "name": "GNU Lesser General Public License v2.1 or later", "status": "APPROVED" }
          ]
        },
        {
          "ossId": 2550,
          "project": "OALX",
          "version": "OALX.webservice.ALL",
          "licenses": [
            { "name": "MIT License", "status": "APPROVED" }
          ]
        }
      ]
    }
    """

    df = pd.read_json(data)
    print(df)
    df1 = df[df["components"]["ossId"] == 2550]

3 Answers

2

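I think your issue is due to the JSON structure. pd.read_json gives you a DataFrame with a single column, components, in which each cell holds one whole record as a dict. df["components"]["ossId"] therefore looks for a row labelled "ossId" in that column rather than for the key inside each dict, and that lookup fails.

Instead, pass the list of records to the DataFrame constructor so that each key becomes its own column. Something like: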

    import json

    # Parse the string first, then build the frame from the list of records
    json_data = json.loads(data)
    df = pd.DataFrame(json_data["components"])
    filtered_data = df[df["ossId"] == 2550]
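
For reference, here is a self-contained sketch of the same idea that can be run as-is. The data is a trimmed copy of the question's JSON (two of the three records), and the variable names records, filtered, licenses and mit are only illustrative. The last step uses pd.json_normalize to get one row per license; that goes beyond what was asked and is just one possible way to flatten the nested list.

```python
import json

import pandas as pd

# Trimmed copy of the question's data (two of the three records).
data = """
{"components": [
  {"ossId": 3946, "project": "OALX",
   "licenses": [{"name": "BSD 3", "status": "APPROVED"}]},
  {"ossId": 2550, "project": "OALX", "version": "OALX.webservice.ALL",
   "licenses": [{"name": "MIT License", "status": "APPROVED"}]}
]}
"""

records = json.loads(data)["components"]

# One row per component: each key becomes a column, missing keys become NaN.
df = pd.DataFrame(records)
filtered = df[df["ossId"] == 2550]
print(filtered)

# Optional: one row per license, carrying ossId and project along as columns.
licenses = pd.json_normalize(records, record_path="licenses", meta=["ossId", "project"])
mit = licenses[licenses["ossId"] == 2550]
print(mit)
```

With the frame built this way, ossId is an ordinary column, so the usual boolean-mask filter works without reaching into nested dicts.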

1

You need to go into the cell's data and get the correct key:

    df[df['components'].apply(lambda x: x.get('ossId') == 2550)]
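
A minimal runnable sketch of this approach, assuming the DataFrame is built with pd.read_json as in the question. The data here is shortened, and wrapping the string in StringIO is my addition: recent pandas versions warn when a literal JSON string is passed directly. The names mask and matches are illustrative.

```python
from io import StringIO

import pandas as pd

# Shortened version of the question's JSON.
data = '{"components": [{"ossId": 3946, "project": "OALX"}, {"ossId": 2550, "project": "OALX"}]}'

# Each cell of the "components" column holds one record as a dict.
df = pd.read_json(StringIO(data))

# dict.get returns None for a missing key, so records without "ossId"
# are simply excluded instead of raising KeyError.
mask = df["components"].apply(lambda cell: cell.get("ossId") == 2550)
matches = df[mask]
print(matches)
```

This keeps the original one-column layout, so the result rows still contain whole dicts rather than separate columns.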

1

Use the .str accessor, which can also index into dict values:

    df[df.components.str['ossId']==2550]
    Out[89]:
                                              components
    2  {'ossId': 2550, 'project': 'OALX', 'version': ...
