Python, Pandas to remove rows with specific value in designated columns

Question

An Excel spreadsheet looked like below.

With Pandas, I want to remove the columns “Project C” and “Project E”, and all rows with value “XX” in Columns “Project A” and “Project D”.

import pandas as pd import numpy as np work_file = "C:\\test.xlsx" df = pd.read_excel(work_file, sheetname = "Sheet1", index_col = 0) column_list_to_remove = [“Project C", “Project E"] results1 = df.drop(column_list_to_remove, axis=1) writer = pd.ExcelWriter("C:\\test new.xlsx") pd.formats.format.header_style = None results1.to_excel(writer,'Sheet1') writer.save()

Above work well in removing the columns.

I try added on this line to remove rows with value “XX” in “Project A”, and failed. Without hope to remove those in “Project D”.

results1 = results1[results1."Project A" != "XX"] # SyntaxError: invalid syntax

How can I remove those rows? Thank you.

Uvar · Accepted Answer · 2017-09-22 09:24:24Z

If your column names did not have whitespace, you could have done something along the lines of:

results1 = results1.query("ColumnName != 'XX'")

or alternatively inplace:

results1.query("ColumnName != 'XX'", inplace=True)

Alas, they do. Now you can either rename those columns, as whitespace is evil incarnate, or do:

results1 = results1[results1['Project A'] != 'XX']

Basically: create a mask for the index and slice the dataframe.

Renaming your columns without the whitespace can be done in a oneliner as well:

results1.columns = [''.join(x.strip().split()) for x in results1.columns]

Then you can proceed with something like:

column_list = ['ProjectA', 'ProjectD'] for col in column_list: results1.query(col+"!='XX'", inplace=True)

thank you. So if I don't want to rename the columns, I just need to repeat like results1 = results1[results1['Project D'] != 'XX'].
Yes; and of course you can put that in a loop or comprehension as well if you have a lot of these columns and you do not wish to copy paste the same line over and over.

Collectives™ on Stack Overflow

Python, Pandas to remove rows with specific value in designated columns

1 Answer 1

2 Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Related