
I have a table

import numpy as np
import pandas as pd

df = pd.DataFrame({'car': ['toyota', 'toyota', 'ford', 'ford'], 'doors': [np.nan, 2.0, np.nan, 4.0], 'seats': [2.0, np.nan, 4.0, np.nan]})

that looks like this:

car     doors  seats
toyota  NaN    2
toyota  2      NaN
ford    NaN    4
ford    4      NaN

I want to replace each NaN with the value from another row that has the same value in a specific column (i.e., car), so that each car ends up with a single row.

I want this:

car     doors  seats
toyota  2      2
ford    4      4
  • What happened to the rows? There are 4 rows to begin with but only 2 rows in the output. What are the rules for that? Commented Jan 7, 2022 at 9:03
  • Also, what if there are 3 rows for toyota? How would it behave then? Commented Jan 7, 2022 at 9:05

2 Answers


Another option is to use groupby with the first method. For each group, first returns the first non-NaN value in every column, because it skips NaN values by default.

out = df.groupby('car', as_index=False).first() 

Output:

      car  doors  seats
0    ford    4.0    4.0
1  toyota    2.0    2.0
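As for the comment about three rows for toyota: the sketch below is only an illustration, using a hypothetical third toyota row (not part of the question's data) whose values conflict with the earlier ones. Under that assumption, first simply keeps whichever non-NaN value appears earliest in each column and silently drops the later ones.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's table plus a third toyota row
# whose 'doors' and 'seats' values conflict with the earlier rows.
df3 = pd.DataFrame({
    'car': ['toyota', 'toyota', 'toyota', 'ford', 'ford'],
    'doors': [np.nan, 2.0, 4.0, np.nan, 4.0],
    'seats': [2.0, np.nan, 5.0, 4.0, np.nan],
})

# first() keeps the first non-NaN entry of each column within each group.
out3 = df3.groupby('car', as_index=False).first()
print(out3)
```

If conflicting values should be detected rather than discarded, a check such as nunique per group would be needed before collapsing the rows.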

Suppose your DataFrame is named Cars_df. Grouping by car and taking the maximum of each column should work, since max skips NaN values by default:

 Cars_df.groupby('car')[['doors', 'seats']].max().reset_index() 
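To see how this approach relates to taking the first non-NaN value, here is a minimal sketch. It rebuilds the question's table, and then appends a hypothetical extra toyota row (an assumption, not in the original data) to show where the two approaches diverge: max picks the largest value per group, while first picks the earliest non-NaN one.

```python
import numpy as np
import pandas as pd

# The table from the question.
df = pd.DataFrame({
    'car': ['toyota', 'toyota', 'ford', 'ford'],
    'doors': [np.nan, 2.0, np.nan, 4.0],
    'seats': [2.0, np.nan, 4.0, np.nan],
})

# With exactly one non-NaN value per group and column, both agree.
by_max = df.groupby('car')[['doors', 'seats']].max().reset_index()
by_first = df.groupby('car').first().reset_index()
same = by_max.equals(by_first)
print(same)

# Hypothetical extra row: a second, larger 'doors' value for toyota.
extra = pd.DataFrame({'car': ['toyota'], 'doors': [4.0], 'seats': [np.nan]})
df_conflict = pd.concat([df, extra], ignore_index=True)

max_doors = df_conflict.groupby('car')['doors'].max()
first_doors = df_conflict.groupby('car')['doors'].first()
print(max_doors['toyota'], first_doors['toyota'])
```

Note that max only makes sense for columns with an ordering that matches the intent; for the question's data either approach gives the same result.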
