Named aggregations with multiple columns

Since pandas 0.25.0 we have named aggregations, which work fine when each aggregation is applied to a single column. But what if you want to apply an aggregation over multiple columns?

Example:

    import numpy as np
    import pandas as pd

    # example dataframe
    df = pd.DataFrame(np.random.rand(4,4), columns=list('abcd'))
    df['group'] = [0, 0, 1, 1]

              a         b         c         d  group
    0  0.751462  0.572576  0.192957  0.921723      0
    1  0.070777  0.801548  0.601678  0.344633      0
    2  0.112964  0.361984  0.416241  0.785764      1
    3  0.380045  0.486494  0.000594  0.608759      1

    # aggregations on single columns
    df.groupby('group').agg(
        a_sum=('a', 'sum'),
        a_mean=('a', 'mean'),
        b_mean=('b', 'mean'),
        c_sum=('c', 'sum'),
        d_range=('d', lambda x: x.max() - x.min())
    )

              a_sum    a_mean    b_mean     c_sum   d_range
    group
    0      0.947337  0.473668  0.871939  0.838150  0.320543
    1      0.604149  0.302074  0.656902  0.542985  0.057681

But what if we want to calculate a.max() - b.max() while aggregating? That does not seem to work. For example, something like this would make sense:

    df.groupby('group').agg(
        diff_a_b=(['a', 'b'], lambda x: x['a'].max() - x['b'].max())
    )

So is it possible to do named aggregations on multiple columns? If not, is this in the pipeline for future releases?
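For reference, the result I am after can be obtained today without named aggregation. The following is only a sketch of two possible workarounds, not a proposed API: it hard-codes the a and b values from the example dataframe above so the output is reproducible, and the names grouped, diff_vectorized and diff_apply are just illustrative.

```python
import pandas as pd

# Fixed copy of columns a and b from the example dataframe above
# (hard-coded instead of np.random.rand so the result is reproducible).
df = pd.DataFrame(
    {
        "a": [0.751462, 0.070777, 0.112964, 0.380045],
        "b": [0.572576, 0.801548, 0.361984, 0.486494],
        "group": [0, 0, 1, 1],
    }
)

grouped = df.groupby("group")

# Workaround 1: aggregate each column on its own, then combine the results.
# Both sides are Series indexed by group, so the subtraction aligns on it.
diff_vectorized = (grouped["a"].max() - grouped["b"].max()).rename("diff_a_b")

# Workaround 2: use apply, which hands the function a sub-DataFrame per group,
# so the function can read several columns at once.
diff_apply = grouped[["a", "b"]].apply(
    lambda g: pd.Series({"diff_a_b": g["a"].max() - g["b"].max()})
)

print(diff_vectorized)
print(diff_apply)
```

The first variant stays vectorized but needs one expression per output column. The second is closer to the syntax I would like agg to support, but apply calls the function once per group, so it is likely slower on many groups.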

Named aggregations with multiple columns #29268
