I'm pretty new to Python and I just encountered a problem.

mini_agg is my original pandas DataFrame and I'm trying to group it by several columns.

trial = mini_agg.groupby(['date','product','product_type_1','product_type_2','product_type_3','product_type_4']).sum()
print(mini_agg.shape)
print(trial.shape)

output:

(2965909, 10)
(499281, 4)

Furthermore, I cannot access the keys I grouped by: they no longer appear as columns in trial. In R, I get those columns back when using aggregate.

Can you please help me? Thank you in advance

  • Please include the mini_agg values in your provided code. Commented Oct 20, 2016 at 13:08

2 Answers

How to GroupBy a Dataframe in Pandas and keep Columns

Just found the answer I didn't find with my previous queries:

trial = mini_agg.groupby(['date','product','product_type_1','product_type_2','product_type_3','product_type_4']).sum().reset_index() 

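Adding .reset_index() is sufficient. By default, groupby turns the grouping keys into the index of the result, which is consistent with trial having only 4 columns (the 10 original columns minus the 6 keys). reset_index() moves the keys back into regular columns.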
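As a minimal sketch of this behavior, the example below uses a small made-up frame (the name df, the sales column, and the values are assumptions for illustration, not the real mini_agg). It also shows as_index=False, a groupby parameter that keeps the keys as columns from the start and can be used instead of reset_index().

```python
import pandas as pd

# Hypothetical stand-in for mini_agg: two key columns and one value column.
df = pd.DataFrame({
    "date": ["2016-10-01", "2016-10-01", "2016-10-02"],
    "product": ["a", "a", "b"],
    "sales": [1, 2, 3],
})

keys = ["date", "product"]

# Default: the group keys become the (Multi)Index of the result,
# so only the aggregated value column is counted in .shape.
indexed = df.groupby(keys).sum()

# reset_index() moves the keys back into ordinary columns.
flat = df.groupby(keys).sum().reset_index()

# as_index=False keeps the keys as columns from the start.
flat_alt = df.groupby(keys, as_index=False).sum()

print(indexed.shape)
print(flat.shape)
print(list(flat.columns))
```

With the keys in the index they can still be read through indexed.index, but most column-based code is easier to write against the flat version.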


I expected the mini_agg values to be provided. Since they are not, I'll assume it is a combination of two one-dimensional labeled data structures (Series). As you mentioned, mini_agg is a pandas DataFrame, and the DataFrame constructor, like the Series constructor, can accept another DataFrame as input.

Therefore, if mini_agg looks like this:

import pandas as pd

FRAME = {
    'one': pd.Series([1., 2., 3.], index=['product_type_1', 'product_type_2', 'product_type_3']),
    'two': pd.Series([1., 2., 3., 4.], index=['product_type_1', 'product_type_2', 'product_type_3', 'product_type_4']),
}
mini_agg = pd.DataFrame(FRAME)

then:

trial = pd.DataFrame(mini_agg, index=['date','product','product_type_1','product_type_2','product_type_3','product_type_4'], columns=['A', 'B', 'C', 'D', 'E', 'F']) 
