In a DataFrame df, how can I find the columns that contain only NaN after grouping the rows?

    In [97]: df
    Out[97]:
         a    b group
    0  NaN  NaN     a
    1  0.0  NaN     a
    2  2.0  NaN     a
    3  1.0  7.0     b
    4  1.0  3.0     b
    5  7.0  4.0     b
    6  2.0  6.0     c
    7  9.0  6.0     c
    8  3.0  0.0     c
    9  9.0  0.0     c

In this case the desired output would be group: a - columns: b

3 Answers

1

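Call set_index on the grouping column first, then mark the NaNs with isnull.

Then groupby the index level and aggregate with all, which is True only where every value of a column within a group is NaN. Finally, reshape with stack and build a new DataFrame holding the group and column names: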

    print (df.set_index('group').isnull().groupby('group').all())
               a      b
    group
    a      False   True
    b      False  False
    c      False  False

    a = df.set_index('group').isnull().groupby('group').all().stack()
    b = pd.DataFrame(a[a].index.values.tolist(), columns=['group','cols'])
    print (b)
      group cols
    0     a    b
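For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. The sample frame is rebuilt by hand from the printed output in the question, and the names `all_nan`, `stacked`, `pairs` and `result` are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the output shown in the question.
df = pd.DataFrame({
    'a': [np.nan, 0.0, 2.0, 1.0, 1.0, 7.0, 2.0, 9.0, 3.0, 9.0],
    'b': [np.nan, np.nan, np.nan, 7.0, 3.0, 4.0, 6.0, 6.0, 0.0, 0.0],
    'group': list('aaabbbcccc'),
})

# True where every value of a column inside a group is NaN.
all_nan = df.set_index('group').isnull().groupby(level='group').all()

# Long format: one boolean per (group, column) pair.
stacked = all_nan.stack()

# Keep only the pairs flagged True and put them in a small frame.
pairs = stacked[stacked].index.tolist()
result = pd.DataFrame(pairs, columns=['group', 'cols'])
print(result)
```

Depending on the pandas version, `stack` may emit a FutureWarning about its upcoming implementation; for a purely boolean frame like this one the result should be the same either way.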

1 Comment

I tried to create a new df with the output.
0

Try this. Note that on pandas 0.22 and later, sum needs min_count=1 so that an all-NaN group sums to NaN instead of 0:

    s = df.groupby('group').sum(min_count=1).unstack()
    s[s.isnull()].reset_index()

      level_0 group   0
    0       b     a NaN
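Summing is an indirect test for "all NaN", and what an all-NaN group sums to depends on the pandas version and its `min_count` setting. As an alternative (not the method used in this answer), counting the non-NaN values per group avoids that dependency. A rough sketch, with the sample frame rebuilt from the question and `counts`, `mask` and `found` as illustrative names:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the output shown in the question.
df = pd.DataFrame({
    'a': [np.nan, 0.0, 2.0, 1.0, 1.0, 7.0, 2.0, 9.0, 3.0, 9.0],
    'b': [np.nan, np.nan, np.nan, 7.0, 3.0, 4.0, 6.0, 6.0, 0.0, 0.0],
    'group': list('aaabbbcccc'),
})

# count() returns the number of non-NaN values per group and column.
counts = df.groupby('group').count()

# A count of zero means the column is entirely NaN inside that group.
mask = counts.eq(0)

# Collect (group, column) pairs where the mask is True.
found = [(grp, col) for col in mask.columns for grp in mask.index[mask[col]]]
print(found)  # [('a', 'b')]
```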

0

Are you looking for this? That is, get each group name together with the value columns that are entirely NaN:

    vals = [(i['group'].iloc[0], i.columns[i.isnull().all()].tolist())
            for _, i in df.groupby('group')]

Output:

 [('a', ['b']), ('b', []), ('c', [])] 
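If only the groups that actually have an all-NaN column are of interest, the same loop can be filtered and keyed by the group label that `groupby` yields. A small sketch along those lines, with the sample frame rebuilt from the question; `value_cols` and `all_nan_cols` are names made up for this example:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the output shown in the question.
df = pd.DataFrame({
    'a': [np.nan, 0.0, 2.0, 1.0, 1.0, 7.0, 2.0, 9.0, 3.0, 9.0],
    'b': [np.nan, np.nan, np.nan, 7.0, 3.0, 4.0, 6.0, 6.0, 0.0, 0.0],
    'group': list('aaabbbcccc'),
})

# Every column except the grouping column is checked.
value_cols = df.columns.drop('group')

all_nan_cols = {}
for name, g in df.groupby('group'):
    # Columns whose values are all NaN within this group.
    cols = [c for c in value_cols if g[c].isna().all()]
    if cols:  # skip groups with no all-NaN column
        all_nan_cols[name] = cols

print(all_nan_cols)  # {'a': ['b']}
```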
