Input dataframe

    import pandas as pd

    data = {
        'org_id': [79, 80, 21, 36, 40, 7, 10, 9, 12, 24],
        'r_id': [79, 80, 20, 20, 20, 7, 7, 9, 12, 12],
        'Type_id': ['P', 'P', 'C', 'C', 'C', 'P', 'C', 'P', 'P', 'C'],
        'grp_id': ['g54', 'g55', 'g13', 'g13', 'g13', 'g6', 'g6', 'g7', 'g8', 'g8'],
    }
    df2 = pd.DataFrame.from_dict(data)
    df2

    Out[271]:
       org_id  r_id Type_id grp_id
    0      79    79       P    g54
    1      80    80       P    g55
    2      21    20       C    g13
    3      36    20       C    g13
    4      40    20       C    g13
    5       7     7       P     g6
    6      10     7       C     g6
    7       9     9       P     g7
    8      12    12       P     g8
    9      24    12       C     g8

Output dataframe

    data = {
        'org_id': [79, 80, 21, 36, 40, 7, 10, 9, 12, 24],
        'r_id': [79, 80, 20, 20, 20, 7, 7, 9, 12, 12],
        'Type_id': ['C', 'C', 'C', 'C', 'C', 'P', 'C', 'C', 'P', 'C'],
        'grp_id': ['g54', 'g55', 'g13', 'g13', 'g13', 'g6', 'g6', 'g7', 'g8', 'g8'],
    }
    df3 = pd.DataFrame.from_dict(data)
    df3

Expected output:

    Out[270]:
       org_id  r_id Type_id grp_id
    0      79    79       C    g54
    1      80    80       C    g55
    2      21    20       C    g13
    3      36    20       C    g13
    4      40    20       C    g13
    5       7     7       P     g6
    6      10     7       C     g6
    7       9     9       C     g7
    8      12    12       P     g8
    9      24    12       C     g8

I want to count the rows in each group of the grp_id column. If a group's count is equal to 1, the Type_id of that row should be changed to 'C'.

E.g. g54 and g55 each occur only once, hence Type_id = 'C'. g13 and g6 appear more than once, so their Type_id values stay unchanged. Thank you.

1 Answer

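Use Series.where with a mask built by Series.duplicated with keep=False, which marks every row whose grp_id occurs more than once. Series.where keeps the values where the mask is True and replaces the rest with 'C':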

    # get all duplicated rows by grp_id
    mask = df2['grp_id'].duplicated(keep=False)
    # alternative: compare group counts, True where the count is not equal to 1
    # mask = df2.groupby('grp_id')['grp_id'].transform('size').ne(1)

    df2['Type_id'] = df2['Type_id'].where(mask, 'C')
    print(df2)

       org_id  r_id Type_id grp_id
    0      79    79       C    g54
    1      80    80       C    g55
    2      21    20       C    g13
    3      36    20       C    g13
    4      40    20       C    g13
    5       7     7       P     g6
    6      10     7       C     g6
    7       9     9       C     g7
    8      12    12       P     g8
    9      24    12       C     g8
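As a self-contained check, the sketch below rebuilds the question's input and confirms that the two masks (duplicated with keep=False and the group-size comparison) select the same rows for this data. The names mask_dup, mask_size and result are illustrative and not part of the original answer, and DataFrame.assign is used here only so the input frame stays unmodified.

```python
import pandas as pd

# Input data copied from the question
data = {
    'org_id': [79, 80, 21, 36, 40, 7, 10, 9, 12, 24],
    'r_id': [79, 80, 20, 20, 20, 7, 7, 9, 12, 12],
    'Type_id': ['P', 'P', 'C', 'C', 'C', 'P', 'C', 'P', 'P', 'C'],
    'grp_id': ['g54', 'g55', 'g13', 'g13', 'g13', 'g6', 'g6', 'g7', 'g8', 'g8'],
}
df2 = pd.DataFrame(data)

# True for every row whose grp_id occurs more than once
mask_dup = df2['grp_id'].duplicated(keep=False)

# Same idea via group sizes: True where the group has more than one row
mask_size = df2.groupby('grp_id')['grp_id'].transform('size').ne(1)

# Both masks should agree row by row on this data
masks_agree = bool((mask_dup == mask_size).all())

# Keep Type_id where the mask is True, otherwise set 'C';
# assign returns a new frame and leaves df2 untouched
result = df2.assign(Type_id=df2['Type_id'].where(mask_dup, 'C'))
print(masks_agree)
print(result)
```

The duplicated variant avoids a groupby, while the transform('size') variant may be easier to adapt if the threshold ever needs to be something other than 1.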