I have a dataset that looks similar to this:

Name Status Activity
Jane student yes
John businessman yes
Elle student no
Chris policeman yes
John businessman no
Clay businessman yes

I want to group the dataset by Status and count the distinct Names that have Activity equal to 'yes'. A name is counted once if it has at least one 'yes'.

Basically, this is the output that I want:

student 1 Jane

businessman 2 John, Clay

policeman 1 Chris

I've tried the following:

cb = (DataFrame.groupby(['Name', 'Status']).sum(DataFrame['Activity'].eq('yes')))
cb = (DataFrame.groupby(['Name', 'Status']).any(DataFrame['Activity'].eq('yes')))
cb = (DataFrame.groupby(['Name', 'Status']).nunique(DataFrame['Activity'].eq('yes')))

but all of them give this error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 

Please help me to fix this code. Thank you in advance!

2 Answers

Example

import pandas as pd

data = {'Name': {0: 'Jane', 1: 'John', 2: 'Elle', 3: 'Chris', 4: 'John', 5: 'Clay'},
        'Status': {0: 'student', 1: 'businessman', 2: 'student', 3: 'policeman', 4: 'businessman', 5: 'businessman'},
        'Activity': {0: 'yes', 1: 'yes', 2: 'no', 3: 'yes', 4: 'no', 5: 'yes'}}
df = pd.DataFrame(data)

Code

out = (df[df['Activity'].eq('yes')]
       .groupby('Status', sort=False)['Name']
       .agg(['count', ', '.join]))

out

             count        join
Status
student          1        Jane
businessman      2  John, Clay
policeman        1       Chris
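As for why the attempts in the question fail: sum, any and nunique on a groupby object do not accept a filter. The boolean Series passed to them is most likely being treated as one of their ordinary flag arguments and evaluated as a single true/false value, which is what raises the "truth value of a Series is ambiguous" error. A minimal sketch of an alternative that keeps the "at least one 'yes'" logic explicit, using a hypothetical helper column named active:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jane', 'John', 'Elle', 'Chris', 'John', 'Clay'],
    'Status': ['student', 'businessman', 'student', 'policeman',
               'businessman', 'businessman'],
    'Activity': ['yes', 'yes', 'no', 'yes', 'no', 'yes'],
})

# Turn the 'yes'/'no' strings into a boolean column first
# ('active' is an illustrative name, not part of the original data).
flags = df.assign(active=df['Activity'].eq('yes'))

# True for each (Status, Name) pair that has at least one 'yes' row.
per_person = flags.groupby(['Status', 'Name'])['active'].any()

# Summing booleans counts the True values within each Status.
counts = per_person.groupby(level='Status').sum()
print(counts)
```

Because the aggregation runs per (Status, Name) pair before counting, a name with several 'yes' rows is still counted only once.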

2 Comments

Thanks for the answer, it works well. But when I apply this code to the real data, it counts the number of 'yes' activities, whereas I want the number of distinct names that have Activity as 'yes'. Why is that?
It's my job to solve the examples and yours to apply them to your dataset. It is not difficult to get your output using this solution. For various reasons, I only solve examples and do not take additional questions. The following function will help you: pandas.pydata.org/docs/reference/api/pandas.unique.html
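For the follow-up in the comments, one possible adjustment is to count distinct names with 'nunique' instead of 'count' and to join only the unique names. This is a sketch rather than the answerer's own code; the extra John row and the output column names count and names are assumptions added for illustration.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical extra row (John, 'yes')
# to show what happens when a name is active more than once.
df = pd.DataFrame({
    'Name': ['Jane', 'John', 'Elle', 'Chris', 'John', 'Clay', 'John'],
    'Status': ['student', 'businessman', 'student', 'policeman',
               'businessman', 'businessman', 'businessman'],
    'Activity': ['yes', 'yes', 'no', 'yes', 'no', 'yes', 'yes'],
})

# Keep only the rows where Activity is 'yes'.
active = df[df['Activity'].eq('yes')]

# 'nunique' counts each name once per Status; Series.unique() drops
# repeated names before joining them into one string.
out = (active.groupby('Status', sort=False)['Name']
       .agg(count='nunique', names=lambda s: ', '.join(s.unique())))
print(out)
```

With 'count' the businessman group would report 3 here, since John has two 'yes' rows; 'nunique' reports 2.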

Check below:

dd = (df.query("Activity != 'no'")
        .groupby('Status')
        .agg({'Name': [','.join, 'count']})
        .reset_index())
dd.columns = ['Status', 'Names', 'count']
dd.head()
