
I have a dataframe that looks like

    name   performance  year
    bob    50           2002
    bob    90           2005
    bob    82           2010
    joey   50           2015
    joey   85           2013
    joey   37           1990
    sarah  90           1994
    sarah  95           2020
    sarah  35           2013

I would like to group by name and compute the average performance, displaying only the top two results in descending order of average performance.

I am currently doing df.groupby(['name']).mean(), but this computes the average of both performance and year, and it displays all 3 names in alphabetical order (I would only like to display the top 2, in descending order of average performance).

    df.groupby('name')['performance'].mean().sort_values(ascending=False).head(2) maybe? Commented Nov 28, 2022 at 3:17
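The one-liner suggested in the comment can be tried against the sample data from the question. This is only a minimal sketch: the frame is rebuilt by hand from the table above, and `top2` is a name chosen here for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": ["bob", "bob", "bob", "joey", "joey", "joey", "sarah", "sarah", "sarah"],
    "performance": [50, 90, 82, 50, 85, 37, 90, 95, 35],
    "year": [2002, 2005, 2010, 2015, 2013, 1990, 1994, 2020, 2013],
})

# Select the performance column before aggregating so that year is not averaged,
# then sort the per-name means in descending order and keep the first two.
top2 = (
    df.groupby("name")["performance"]
      .mean()
      .sort_values(ascending=False)
      .head(2)
)

print(top2)  # bob is about 74.0, sarah is about 73.33
```

Selecting `['performance']` right after `groupby` is what keeps `year` out of the result; the sort and `head(2)` then handle the ordering and the top-two cut.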

1 Answer


Here is my solution. Basically, you were missing one field in the groupby call.

Code:

    import pandas as pd

    # defining columns
    cols = ['name', 'performance', 'year']

    # defining data
    data = [
        ['bob', 50, 2002],
        ['bob', 90, 2005],
        ['bob', 82, 2010],
        ['joey', 50, 2015],
        ['joey', 85, 2013],
        ['joey', 37, 1990],
        ['sarah', 90, 1994],
        ['sarah', 95, 2020],
        ['sarah', 35, 2013],
    ]

    # create dataframe
    df = pd.DataFrame(data, columns=cols)

    # dataframe, grouped by name and year, aggregated by mean() of performance, first 2 values in descending order
    df = df.groupby(['name', 'year'])['performance'].mean().sort_values(ascending=False).head(2)

    # resetting index to display performance column name
    df = df.reset_index()

    # print dataframe
    print(df)

Output:

        name  year  performance
    0  sarah  2020         95.0
    1    bob  2005         90.0
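One thing worth checking about the output above: every (name, year) pair occurs only once in the sample data, so grouping by both columns leaves each group with a single row, and the "mean" is just that row's own performance value. If the goal is the per-name average from the question, grouping by name alone seems closer to what was asked. The sketch below is one possible way to get a DataFrame back; the column name `avg_performance` and the variable names are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
data = [
    ["bob", 50, 2002], ["bob", 90, 2005], ["bob", 82, 2010],
    ["joey", 50, 2015], ["joey", 85, 2013], ["joey", 37, 1990],
    ["sarah", 90, 1994], ["sarah", 95, 2020], ["sarah", 35, 2013],
]
df = pd.DataFrame(data, columns=["name", "performance", "year"])

# Grouping by name AND year: each group holds exactly one row here,
# so nothing is actually averaged.
group_sizes = df.groupby(["name", "year"]).size()
print(group_sizes.max())  # largest group size in this sample

# Grouping by name only, with a named aggregation (hypothetical column name),
# then keeping the two largest averages in descending order.
top2_df = (
    df.groupby("name", as_index=False)
      .agg(avg_performance=("performance", "mean"))
      .nlargest(2, "avg_performance")
      .reset_index(drop=True)
)
print(top2_df)
```

With `as_index=False` the name stays a regular column, so no separate `reset_index()` is needed just to bring it back; the final `reset_index(drop=True)` only renumbers the two remaining rows.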