Losing a column when using .groupby() on a pandas DataFrame

Question

I am trying to use the .groupby() function with pandas DataFrames, but I keep losing the column that I am trying to group by. I tried to group by the year, and the grouping itself succeeds, but the column name gets removed, so I am unable to call the column. An extra row is added that has the column name, but I am unable to access it. Am I doing something wrong?

For example, I ran the code below:

stats2 = stats.groupby('yearID').mean()

and I get this as the result:

              2B        3B        HR        BB        1B
yearID
1956    0.035939  0.007809  0.024694  0.096666  0.164637
1957    0.036462  0.007220  0.023651  0.087744  0.167484
1958    0.036856  0.007120  0.024353  0.088281  0.166760

Any ideas on what I am doing wrong and how I can fix this?

Thanks.

Woody Pride · Accepted Answer · 2014-10-12 04:09:19Z

Use the as_index=False option when grouping:

stats2 = stats.groupby('yearID', as_index=False).mean()

As the other user has made clear, the default behaviour is that the group key becomes the index. This behaviour is prevented by using the option just described.
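To make the difference concrete, here is a minimal sketch comparing the two calls. The sample values are made up for illustration; only the column names (yearID, HR, BB) come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
stats = pd.DataFrame({
    "yearID": [1956, 1956, 1957, 1957],
    "HR": [0.02, 0.03, 0.02, 0.04],
    "BB": [0.09, 0.10, 0.08, 0.09],
})

# Default behaviour: the group key becomes the index of the result.
by_index = stats.groupby("yearID").mean()
print(by_index.index.name)           # yearID
print("yearID" in by_index.columns)  # False

# With as_index=False the group key stays an ordinary column.
by_column = stats.groupby("yearID", as_index=False).mean()
print(list(by_column.columns))       # ['yearID', 'HR', 'BB']
```

With as_index=False the result keeps a default integer index, so by_column["yearID"] works like any other column lookup.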

FooBar · Accepted Answer · 2014-10-12 03:46:13Z

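The column you group by becomes the index of the result. That is the "extra row" you are seeing: the line containing yearID is the name of the index, not a row of data.

If you want to recover it as a column, call stats2.reset_index(). Note that this returns a new DataFrame, so assign the result back, e.g. stats2 = stats2.reset_index().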
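A short sketch of that approach, again using made-up sample values (only the names stats, stats2, yearID and HR are taken from the question):

```python
import pandas as pd

# Hypothetical sample data; only the names follow the question.
stats = pd.DataFrame({
    "yearID": [1956, 1956, 1957, 1957],
    "HR": [0.02, 0.03, 0.02, 0.04],
})

stats2 = stats.groupby("yearID").mean()

# At this point yearID is the index, so stats2["yearID"] would raise a KeyError.
# The years are still reachable through the index:
years_from_index = stats2.index.tolist()

# reset_index() moves the index back into a regular column.
stats2 = stats2.reset_index()
print(list(stats2.columns))
print(stats2["yearID"].tolist())
```

If you only need the years themselves, reading stats2.index directly avoids the extra step.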
