pandas how to insert value from another row

Question

I have a dataframe:

    import pandas as pd

    data = {
        'fruit': ['pear', 'pear', 'banana', 'pear', 'pear', 'apple', 'apple', 'cherry', 'cherry'],
        'fruit_type': ['unknown', 'pear', 'unknown', 'unknown', 'pear', 'unknown', 'apple', 'cherry', 'unknown'],
        'country': ['unknown', 'usa', 'unknown', 'unknown', 'ghana', 'unknown', 'russia', 'albania', 'unknown'],
        'id': ['011', '011', '011', '011', '011', '011', '011', '6', '6'],
        'month': ['unknown', 'march', 'unknown', 'unknown', 'january', 'unknown', 'march', 'january', 'unknown'],
    }
    df = pd.DataFrame(data, columns=['fruit', 'fruit_type', 'country', 'id', 'month'])

I want to fill the rows that contain unknown with values from another row, within each group by id:

If the unknown value in the month column is in the first row of its id group, the unknown values should be filled from the next row.

If the unknown value in the month column is not in the first row of its id group, the unknown values should be filled from the previous row.

Can anyone see the problem?

Output dataframe:

what have you tried so far?

– alec_djinn, Commented Jul 26, 2021 at 9:27

Anurag Dabas · Accepted Answer · 2021-07-26 09:47:26Z

Use replace() to replace 'unknown' with NaN, then group by 'id', forward fill, then backward fill, and finally assign the result back to df:

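    df = df.replace('unknown', float('nan'))
    # If the replace above doesn't work, then use:
    # df = df.replace('unknown', float('nan'), regex=True)

    # group_keys=False keeps the original row index
    # (newer pandas versions otherwise add 'id' to the index).
    df = df.groupby('id', group_keys=False).apply(lambda x: x.ffill().bfill())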

output of df:

        fruit fruit_type  country   id    month
    0    pear       pear      usa  011    march
    1    pear       pear      usa  011    march
    2  banana       pear      usa  011    march
    3    pear       pear      usa  011    march
    4    pear       pear    ghana  011  january
    5   apple       pear    ghana  011  january
    6   apple      apple   russia  011    march
    7  cherry     cherry  albania    6  january
    8  cherry     cherry  albania    6  january
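To see why the forward fill comes before the backward fill, here is a minimal sketch on a single column. It assumes the month values of the id '011' group from the question; the names `month`, `forward` and `filled` are only illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative data: the 'month' values of the id '011' group from the question.
month = pd.Series(['unknown', 'march', 'unknown', 'unknown',
                   'january', 'unknown', 'march'])
month = month.replace('unknown', np.nan)

# Forward fill copies the previous known value downwards.
# The first row has no previous value, so it stays missing.
forward = month.ffill()

# Backward fill then handles only what is left: the leading missing value,
# which takes the next known value.
filled = forward.bfill()

print(filled.tolist())
# ['march', 'march', 'march', 'march', 'january', 'january', 'march']
```

Reversing the order (bfill first, then ffill) would fill the rows in the middle from the next row instead of the previous one, which is not what the question asks for.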

jezrael · Accepted Answer · 2021-07-26 09:38:34Z

Replace unknown with missing values, then forward fill and backward fill the missing values per group:

    import numpy as np

    f = lambda x: x.ffill().bfill()
    df = df.replace('unknown', np.nan).groupby(df['id']).transform(f)
    print(df)

        fruit fruit_type  country   id    month
    0    pear       pear      usa  011    march
    1    pear       pear      usa  011    march
    2  banana       pear      usa  011    march
    3    pear       pear      usa  011    march
    4    pear       pear    ghana  011  january
    5   apple       pear    ghana  011  january
    6   apple      apple   russia  011    march
    7  cherry     cherry  albania    6  january
    8  cherry     cherry  albania    6  january
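The same idea can probably also be written without a lambda, using the built-in groupby `ffill()` and `bfill()` methods. This is only a sketch under a few assumptions: the data is the question's dataframe, only the three columns that contain unknown are filled, and the names `cols` and `clean` are illustrative.

```python
import numpy as np
import pandas as pd

# The question's data, repeated so the example is self-contained.
df = pd.DataFrame({
    'fruit': ['pear', 'pear', 'banana', 'pear', 'pear', 'apple', 'apple', 'cherry', 'cherry'],
    'fruit_type': ['unknown', 'pear', 'unknown', 'unknown', 'pear', 'unknown', 'apple', 'cherry', 'unknown'],
    'country': ['unknown', 'usa', 'unknown', 'unknown', 'ghana', 'unknown', 'russia', 'albania', 'unknown'],
    'id': ['011', '011', '011', '011', '011', '011', '011', '6', '6'],
    'month': ['unknown', 'march', 'unknown', 'unknown', 'january', 'unknown', 'march', 'january', 'unknown'],
})

# Assumption: only these columns can hold 'unknown'.
cols = ['fruit_type', 'country', 'month']

clean = df.replace('unknown', np.nan)

# Fill from the previous row inside each id group ...
clean[cols] = clean.groupby('id')[cols].ffill()
# ... then fill what is still missing (the first rows of a group) from the next row.
clean[cols] = clean.groupby('id')[cols].bfill()

print(clean['month'].tolist())
# ['march', 'march', 'march', 'march', 'january', 'january', 'march', 'january', 'january']
```

Because only the selected columns are assigned back, the fruit and id columns stay untouched and the original row index is kept.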
