Pandas Dataframe groupby()

Question

I have a dataset that looks similar to this:

Name	Status	Activity
Jane	student	yes
John	businessman	yes
Elle	student	no
Chris	policeman	yes
John	businessman	no
Clay	businessman	yes

I want to group the dataset by Status and count the distinct Names that have Activity equal to 'yes'. A name should be counted if it has at least one 'yes' row.

Basically, this is the output that I want:

student 1 Jane

businessman 2 John, Clay

policeman 1 Chris

I've tried the following:

cb = (DataFrame.groupby(['Name', 'Status']).sum(DataFrame['Activity'].eq('yes')))
cb = (DataFrame.groupby(['Name', 'Status']).any(DataFrame['Activity'].eq('yes')))
cb = (DataFrame.groupby(['Name', 'Status']).nunique(DataFrame['Activity'].eq('yes')))

but all of them raise this error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Please help me to fix this code. Thank you in advance!

Panda Kim · Accepted Answer · 2022-12-26 13:51:22Z

Example

import pandas as pd

# Sample data from the question
data = {
    'Name': {0: 'Jane', 1: 'John', 2: 'Elle', 3: 'Chris', 4: 'John', 5: 'Clay'},
    'Status': {0: 'student', 1: 'businessman', 2: 'student', 3: 'policeman', 4: 'businessman', 5: 'businessman'},
    'Activity': {0: 'yes', 1: 'yes', 2: 'no', 3: 'yes', 4: 'no', 5: 'yes'},
}
df = pd.DataFrame(data)

Code

out = (
    df[df['Activity'].eq('yes')]
    .groupby('Status', sort=False)['Name']
    .agg(['count', ', '.join])
)

out

             count        join
Status
student          1        Jane
businessman      2  John, Clay
policeman        1       Chris

Thanks for the answer, it works well. But when I run this code on my real data, it counts the number of 'yes' activity rows, whereas I want the number of distinct names that have Activity as 'yes'. Why is that?
It's my job to solve the examples and yours to apply them to your dataset. It is not difficult to get your output using the solution. For various reasons, I only solve examples and do not take additional questions. The following function will help you: pandas.pydata.org/docs/reference/api/pandas.unique.html
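
For the distinct-name count asked about in the comment, one possible sketch is to drop repeated (Status, Name) pairs after filtering and before aggregating. The extra "John / yes" row below is an assumption added only to mimic real data where the same name has several 'yes' rows; it is not part of the original example.

```python
import pandas as pd

# Question's sample data plus one hypothetical repeated 'yes' row for John
df = pd.DataFrame({
    "Name": ["Jane", "John", "Elle", "Chris", "John", "Clay", "John"],
    "Status": ["student", "businessman", "student", "policeman",
               "businessman", "businessman", "businessman"],
    "Activity": ["yes", "yes", "no", "yes", "no", "yes", "yes"],
})

out = (
    df[df["Activity"].eq("yes")]           # keep only the 'yes' rows
    .drop_duplicates(["Status", "Name"])   # one row per name within a status
    .groupby("Status", sort=False)["Name"]
    .agg(["count", ", ".join])             # count is now a distinct-name count
)
print(out)
```

Without the drop_duplicates step, John would be counted twice for businessman in this data, which appears to be the behaviour described in the comment.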

Abhishek · Accepted Answer · 2022-12-26 15:07:47Z

Check below:

dd = (
    df.query("Activity != 'no'")
    .groupby('Status')
    .agg({'Name': [','.join, 'count']})
    .reset_index()
)
dd.columns = ['Status', 'Names', 'count']
dd.head()

Output:

        Status      Names  count
0  businessman  John,Clay      2
1    policeman      Chris      1
2      student       Jane      1
