2

My pandas dataframe has a column where each row is a string which corresponds to a filename. I read my data from a JSON file and extract the column like this:

import pandas as pd

df = pd.read_json("mergedJSON.txt", lines=True, orient='columns')
df2 = df.set_index("subject")
# some_dict maps keys to "subject" labels; write each subject's file names out
for key, value in some_dict.items():
    df2.loc[value, "file_name"].to_csv(outfile, index=False, header=False)

I need to drop certain rows from this dataframe based on whether the file named in file_name is found on disk. I'm not sure how to do this and would appreciate any help.

2 Answers

1

Just use this as the last line; it keeps only the rows whose file_name contains the given string:

df2[df2.file_name.str.contains('stringValue')].loc[value,:].to_csv() 
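
The same boolean-mask pattern can test for the file on disk instead of a substring. This is a minimal sketch, assuming that file_name holds paths `os.path.isfile` can resolve and that rows for missing files are the ones to drop (invert the mask otherwise). The sample frame and the helper name `keep_existing` are made up for illustration.

```python
import os
import tempfile

import pandas as pd


def keep_existing(df, column="file_name"):
    """Return only the rows whose `column` value names a file that exists on disk."""
    # Series.map applies os.path.isfile to each path and yields a boolean mask.
    mask = df[column].map(os.path.isfile)
    return df[mask]


# Illustrative data: one real temporary file and one path that does not exist.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    real_path = tmp.name
missing_path = real_path + ".missing"

sample = pd.DataFrame(
    {"subject": ["a", "b"], "file_name": [real_path, missing_path]}
).set_index("subject")

filtered = keep_existing(sample)
print(filtered)

os.remove(real_path)  # clean up the temporary file
```

Applied to the question's code, the filtered frame would take the place of df2 before the `.loc[value, "file_name"]` lookup.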

0

First, use set_index (or reindex) so that the filename is the index, and then call df.drop(filename) to remove the rows for that file.
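
A rough sketch of this index-then-drop approach, assuming that a row should be dropped when its file is missing. The temporary directory, the file names and the variable names are hypothetical and only serve the example.

```python
import os
import shutil
import tempfile

import pandas as pd

# Illustrative data: "present.txt" is created on disk, "absent.txt" is not.
tmp_dir = tempfile.mkdtemp()
present = os.path.join(tmp_dir, "present.txt")
absent = os.path.join(tmp_dir, "absent.txt")
with open(present, "w") as handle:
    handle.write("placeholder")

frame = pd.DataFrame({"subject": ["a", "b"], "file_name": [present, absent]})

# Make the file name the index so rows can be dropped by label.
by_file = frame.set_index("file_name")

# Collect the labels whose file is missing, then drop them in one call.
missing = [name for name in by_file.index if not os.path.exists(name)]
result = by_file.drop(missing).reset_index()
print(result)

shutil.rmtree(tmp_dir)  # clean up the temporary directory
```

Note that dropping by label removes every row that shares a file name, which is usually what is wanted when the file itself is missing.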
