I am currently doing some exercises on a pandas DataFrame indexed by date (DD/MM/YY). The current exercise requires me to group by year to obtain average yearly values. To do that, I tried to create a new column containing only the years extracted from the DataFrame's index. The code I wrote is:

    data["year"] = [t.year for t in data.index]
    data.groupby("year").mean()

but for some reason, the new column "year" ends up replacing the previous full-date index in the result (the dates do not even become a "standard" column, they simply disappear), which came as a bit of a surprise. How can this be?

Thanks in advance!

  • Can you include a sample of your dataframe in your question? Commented Nov 2, 2018 at 23:31

1 Answer

For a sample dataframe:

                value
    2016-01-22      1
    2014-02-02      2
    2014-08-27      3
    2016-01-23      4
    2014-03-18      5

If you would like to keep your logic, select the column you want to average, use transform('mean') instead of mean() so that the result keeps the original date index, and assign it back to the value column:

    data['year'] = [t.year for t in data.index]
    data['value'] = data.groupby('year')['value'].transform('mean')

Yields:

                   value  year
    2016-01-22  2.500000  2016
    2014-02-02  3.333333  2014
    2014-08-27  3.333333  2014
    2016-01-23  2.500000  2016
    2014-03-18  3.333333  2014
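
As for why the dates disappear in the question's version: an aggregation such as mean() returns one row per group, so the group keys (the years) become the index of the result and the individual dates have nowhere to go. The sketch below contrasts the two behaviours on the sample data. It assumes the index is a DatetimeIndex, and the names `yearly` and `per_row` are only illustrative; grouping directly on `data.index.year` is an alternative to building a separate "year" column, not something taken from the original post.

```python
import pandas as pd

# Sample frame mirroring the example above; the dates are parsed into a DatetimeIndex.
data = pd.DataFrame(
    {"value": [1, 2, 3, 4, 5]},
    index=pd.to_datetime(
        ["2016-01-22", "2014-02-02", "2014-08-27", "2016-01-23", "2014-03-18"]
    ),
)

# Aggregation: one row per group, so the years become the index of the result.
yearly = data.groupby(data.index.year)["value"].mean()

# Transform: one value per original row, so the date index is preserved.
per_row = data.groupby(data.index.year)["value"].transform("mean")

print(yearly)
print(per_row)
```

If one row per year is what the exercise asks for, `yearly` is probably already the desired result; the year-based index is expected rather than a sign that something went wrong.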