2

I am new to pandas and trying to complete the following:

I have a dataframe which looks like this:

row  A    B
1    abc  abc
2    abc
3         abc
4
5    abc  abc

My desired output would look like this:

row  A    B
1    abc  abc
2    abc
3         abc
5    abc  abc

I am trying to drop rows if there is no value in both A and B columns:

if finalized_export_cf[finalized_export_cf['A']].str.len()<2:
    if finalized_export_cf[finalized_export_cf['B']].str.len()<2:
        finalized_export_cf[finalized_export_cf['B']].drop()

But that gives me the following error:

ValueError: cannot index with vector containing NA / NaN values 

How could I drop values when both columns have an empty cell? Thank you for your suggestions.

5 Answers

4

You can check whether a row contains only nulls by chaining .isnull() and .all(). isnull() produces a dataframe of booleans, and all(axis=1) checks whether all values in a given row are True. If so, every value in that row is null:

inds = df[["A", "B"]].isnull().all(axis=1) 

You can then use inds to remove all rows that contain only nulls. First negate it with the tilde ~, or else you keep only the rows with missing values:

df = df.loc[~inds, :] 
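
As a minimal sketch of the approach above, assuming the empty cells are NaN: the sample frame below is hypothetical, mirrors the question's data, and adds an extra column C to show that columns outside the selected pair do not affect the result.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question, plus an unrelated column C
df = pd.DataFrame({
    "row": [1, 2, 3, 4, 5],
    "A": ["abc", "abc", np.nan, np.nan, "abc"],
    "B": ["abc", np.nan, "abc", np.nan, "abc"],
    "C": ["x", "x", "x", "x", "x"],
})

# True only for rows where both A and B are missing; other columns are ignored
inds = df[["A", "B"]].isnull().all(axis=1)

# Keep every row that is not flagged
cleaned = df.loc[~inds, :]
print(cleaned)
```

Because only A and B are selected before calling isnull(), the same two lines work unchanged on a frame with ten columns.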

1 Comment

I have 10 columns and only need to check 2 of those, how would I proceed with this set up?
2

For your use case you can create a mask of missing values and keep the rows where A and B are not both missing:

mask = df.isna()
df[~(mask.A & mask.B)]

output:

   row    A    B
0    1  abc  abc
1    2  abc  NaN
2    3  NaN  abc
4    5  abc  abc
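
The mask can also be read the other way round: "not (A missing and B missing)" is the same condition as "A present or B present". A small sketch, assuming NaN marks the empty cells and using a hypothetical sample frame, checks that both forms select the same rows:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question
df = pd.DataFrame({
    "row": [1, 2, 3, 4, 5],
    "A": ["abc", "abc", np.nan, np.nan, "abc"],
    "B": ["abc", np.nan, "abc", np.nan, "abc"],
})

# Form 1: drop rows where both A and B are missing
mask = df[["A", "B"]].isna()
drop_both_missing = df[~(mask["A"] & mask["B"])]

# Form 2: keep rows where at least one of A, B is present
keep_any_present = df[df[["A", "B"]].notna().any(axis=1)]

# Both selections contain the same rows
print(drop_both_missing.equals(keep_any_present))
```

The second form scales more easily when more than two columns need checking, since the column list is the only thing that changes.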

2

If the missing values are NaNs, use DataFrame.dropna with how='all' and the subset parameter:

print (df)
   row    A    B
0    1  abc  abc
1    2  abc  NaN
2    3  NaN  abc
3    4  NaN  NaN
4    5  abc  abc

df = df.dropna(how='all', subset=['A','B'])
print (df)
   row    A    B
0    1  abc  abc
1    2  abc  NaN
2    3  NaN  abc
4    5  abc  abc

Or, if the empty values are empty strings, compare for not equal to '' with DataFrame.ne and keep the rows where DataFrame.any is True:

print (df)
   row    A    B
0    1  abc  abc
1    2  abc
2    3       abc
3    4
4    5  abc  abc

df = df[df[['A','B']].ne('').any(axis=1)]
print (df)
   row    A    B
0    1  abc  abc
1    2  abc
2    3       abc
4    5  abc  abc
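
Which of the two variants applies depends on how the empty cells are stored. The sketch below uses a hypothetical frame whose empty cells are empty strings and shows that dropna leaves it untouched, because '' is not treated as missing, while the ne('') filter removes the row:

```python
import pandas as pd

# Hypothetical sample data where empty cells are empty strings, not NaN
df = pd.DataFrame({
    "row": [1, 2, 3, 4, 5],
    "A": ["abc", "abc", "", "", "abc"],
    "B": ["abc", "", "abc", "", "abc"],
})

# dropna removes nothing here: an empty string does not count as missing
after_dropna = df.dropna(how="all", subset=["A", "B"])
print(len(after_dropna))

# Comparing against '' finds the row where both A and B are empty
after_ne = df[df[["A", "B"]].ne("").any(axis=1)]
print(after_ne["row"].tolist())
```

Checking df[['A','B']].isna().sum() first is a quick way to see which representation a given frame uses.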

3 Comments

Hi jezrael, does this drop rows only if both A and B columns are empty or either one of those?
@KenHBS - sure, rows are removed
@JonasPalačionis - It tests only the A and B columns, i.e. all columns specified in the subset parameter
1

If you have only two columns, you can use the how parameter of pandas.DataFrame.dropna by setting it to 'all':

df.dropna(how='all') 
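
The "only two columns" condition matters. As a hedged illustration on a hypothetical frame shaped like the question's, how='all' without subset also looks at the row column, which is never empty, so nothing is dropped:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question, including the 'row' column
df = pd.DataFrame({
    "row": [1, 2, 3, 4, 5],
    "A": ["abc", "abc", np.nan, np.nan, "abc"],
    "B": ["abc", np.nan, "abc", np.nan, "abc"],
})

# All three columns are checked; 'row' always has a value, so no row is all-NaN
unchanged = df.dropna(how="all")
print(len(unchanged))

# With only A and B in the frame, the all-NaN row is removed
only_ab = df[["A", "B"]].dropna(how="all")
print(only_ab.index.tolist())
```

Note also that dropna returns a new dataframe, so the result has to be assigned for the change to persist.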

1

First of all, we need to change the blank cells (empty or whitespace-only strings) to NaN:

import numpy as np
df = df.replace(r'^\s*$',np.nan,regex=True)

Then drop the NaN rows, restricting the check to columns A and B with subset:

df = df.dropna(subset=['A','B'],how='all').fillna(' ') # if you want spaces for na
print(df)
   row    A    B
0    1  abc  abc
1    2  abc
2    3       abc
4    5  abc  abc
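
Put together, the two steps can be sketched end to end. The sample frame is hypothetical and mixes empty strings with whitespace-only strings, which the regex ^\s*$ treats alike:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: blanks are a mix of '' and whitespace-only strings
df = pd.DataFrame({
    "row": [1, 2, 3, 4, 5],
    "A": ["abc", "abc", "  ", "", "abc"],
    "B": ["abc", "", "abc", " ", "abc"],
})

# Step 1: turn every empty or whitespace-only cell into NaN
normalized = df.replace(r"^\s*$", np.nan, regex=True)

# Step 2: drop rows where both A and B are NaN
result = normalized.dropna(subset=["A", "B"], how="all")
print(result["row"].tolist())
```

Normalizing to NaN first means the same dropna call handles frames where the blanks were stored inconsistently.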

1 Comment

@Jonas Palačionis - I suggest you use this answer - this will work
