3

Given a dataframe with similar columns having null values in between. How to dynamically fill nulls in the columns with non-null values from other columns without explicitly stating the names of other column names e.g. select first column category1 and fill the null rows with values from other columns of same rows?

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016,2017, 2018, 2019], 'category1': [None, 21, None, 10, None, 30, 31,45, 23, 56], 'category2': [10, 21, 20, 10, None, 30, None,45, 23, 56], 'category3': [10, 21, 20, 10, None, 30, 31,45, 23, 56],} df = pd.DataFrame(data) df = df.set_index('year') df category1 category2 category3 year 2010 NaN 10 10 2011 21 21 21 2012 NaN 20 20 2013 10 10 10 2014 NaN NaN NaN 2015 30 30 NaN 2016 31 NaN 31 2017 45 45 45 2018 23 23 23 2019 56 56 56 

After filling category1:

category1 category2 category3 year 2010 10 10 10 2011 21 21 21 2012 20 20 20 2013 10 10 10 2014 NaN NaN NaN 2015 30 30 NaN 2016 31 NaN 31 2017 45 45 45 2018 23 23 23 2019 56 56 56 
0

3 Answers 3

1

You can use first_valid_index with condition if all values are NaN:

def f(x): if x.first_valid_index() is None: return None else: return x[x.first_valid_index()] df['a'] = df.apply(f, axis=1) print (df) category1 category2 category3 a year 2010 NaN 10.0 10.0 10.0 2011 21.0 21.0 21.0 21.0 2012 NaN 20.0 20.0 20.0 2013 10.0 10.0 10.0 10.0 2014 NaN NaN NaN NaN 2015 30.0 30.0 30.0 30.0 2016 31.0 NaN 31.0 31.0 2017 45.0 45.0 45.0 45.0 2018 23.0 23.0 23.0 23.0 2019 56.0 56.0 56.0 56.0 
Sign up to request clarification or add additional context in comments.

1 Comment

thanks @jezrael, I updated the question. I meant to dynamically fill with other columns without explicitly stating the other columns
1

IIUC you can do it this way:

In [369]: df['category1'] = df['category1'].fillna(df['category2']) In [370]: df Out[370]: category1 category2 category3 year 2010 10.0 10.0 10.0 2011 21.0 21.0 21.0 2012 20.0 20.0 20.0 2013 10.0 10.0 10.0 2014 NaN NaN NaN 2015 30.0 30.0 30.0 2016 31.0 NaN 31.0 2017 45.0 45.0 45.0 2018 23.0 23.0 23.0 2019 56.0 56.0 56.0 

3 Comments

how to dynamically fill, as the other column names can be unknown?
@user1128088, i don't get it, can you make an example?
..without hardcoding it to category2
0

Try This:

df['category1']= df['category1'].fillna(df.median(axis=1)) 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.