
I have the following DataFrame (50,000 rows covering 2016, 2017 and 2018):

    data.head()

                Values
    Date
    2016-07-20    10.0
    2016-07-20    10.0
    2016-07-22     1.0
    2016-07-22     1.0
    2016-07-22     1.0

A simple groupby:

    data.groupby([(data.index.year==2018),(data.index.month==5)])['Values'].sum()

    False  False    1461787.35
           True       80714.53
    True   False     862589.66
           True      172600.04
    Name: Values, dtype: float64

How can I replace the False and True labels for year and month with 2018 and 5?

Expected output:

    2018  5    80714.53
    Name: Values, dtype: float64
  • Sorry, what is expected output? Commented Dec 22, 2018 at 15:31
  • The same but with numbers in place of (false/true) Commented Dec 22, 2018 at 15:33

1 Answer


You need to group by the year and month values only:

    # changed data for better sample
    print (data)
                Values
    Date
    2015-07-20    10.0
    2015-05-20    20.0
    2016-05-20     2.0
    2018-05-22     1.0
    2018-05-22     3.0
    2018-07-22     4.0

    df1 = data.groupby([data.index.year, data.index.month])['Values'].sum()
    print (df1)
    Date  Date
    2015  5       20.0
          7       10.0
    2016  5        2.0
    2018  5        4.0
          7        4.0
    Name: Values, dtype: float64

Your solution groups by boolean masks: in the first level, True marks the 2018 rows and False marks all non-2018 rows, and the second level does the same for month 5:

    df2 = data.groupby([(data.index.year==2018),(data.index.month==5)])['Values'].sum()
    print (df2)
    False  False    10.0
           True     22.0
    True   False     4.0
           True      4.0
    Name: Values, dtype: float64
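To make the mask behaviour concrete, here is a minimal sketch that rebuilds the sample data and checks what each boolean group contains. The names `is_2018`, `is_may` and `by_mask` are only illustrative and are not part of the original code.

```python
import pandas as pd

# Sample data copied from the answer.
data = pd.DataFrame(
    {"Values": [10.0, 20.0, 2.0, 1.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        ["2015-07-20", "2015-05-20", "2016-05-20",
         "2018-05-22", "2018-05-22", "2018-07-22"],
        name="Date",
    ),
)

# Each mask is a plain boolean array, so the group keys can only be
# False or True, never the year or month numbers themselves.
is_2018 = data.index.year == 2018
is_may = data.index.month == 5

by_mask = data.groupby([is_2018, is_may])["Values"].sum()

# Groups are sorted False before True:
# (False, False), (False, True), (True, False), (True, True)
print(by_mask.tolist())  # [10.0, 22.0, 4.0, 4.0]

# The (False, True) group lumps every non-2018 May row together:
# 2015-05-20 (20.0) + 2016-05-20 (2.0)
print(data.loc[~is_2018 & is_may, "Values"].sum())  # 22.0
```

So the boolean version answers "is this row in 2018?" and "is this row in May?", which is why rows from 2015 and 2016 end up sharing a group.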

If you want only the filtered values, it is possible to use:

    df = data[(data.index.year==2018) & (data.index.month==5)]
    print (df)
                Values
    Date
    2018-05-22     1.0
    2018-05-22     3.0

    out = df.groupby([df.index.year, df.index.month])['Values'].sum()
    print (out)
    Date  Date
    2018  5       4.0
    Name: Values, dtype: float64
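As a possible alternative, the full year/month totals can be computed once and the wanted pair selected afterwards. This is only a sketch under the same sample data. The level names `year` and `month` and the variable names `totals`, `may_2018` and `may_2018_series` are my own choices for illustration.

```python
import pandas as pd

# Sample data copied from the answer.
data = pd.DataFrame(
    {"Values": [10.0, 20.0, 2.0, 1.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        ["2015-07-20", "2015-05-20", "2016-05-20",
         "2018-05-22", "2018-05-22", "2018-07-22"],
        name="Date",
    ),
)

# Group by the actual year and month values, then give the two index
# levels distinct names (both would otherwise be called "Date").
totals = (
    data.groupby([data.index.year, data.index.month])["Values"]
    .sum()
    .rename_axis(["year", "month"])
)

# A tuple key returns the scalar total for one (year, month) pair.
may_2018 = totals.loc[(2018, 5)]
print(may_2018)  # 4.0

# A list of tuples keeps the MultiIndex, so the result is still a Series
# labelled with 2018 and 5.
may_2018_series = totals.loc[[(2018, 5)]]
print(may_2018_series)
```

This way other year/month pairs can be looked up later without filtering and grouping again.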

2 Comments

Thank you, that's what I'm looking for.
@Math - You are welcome! I also added samples to better show the difference.
