How can I concatenate the unique values of several text columns of a pandas DataFrame into a single column? For example:

    import pandas as pd

    data = [[1, "US", "California", "Los Angeles"],
            [1, "US", "California", "San Francisco"],
            [1, "US", "California", "San Diego"],
            [1, "US", "Texas", "Austin"],
            [2, "IND", "Maharashtra", "Mumbai"],
            [2, "IND", "Maharashtra", "Pune"],
            [2, "IND", "Maharashtra", "Nagpur"]]
    df = pd.DataFrame(data, columns=['Country_Id', 'Country', 'State', 'Place'])

From the above DataFrame, how do I generate output with one field holding Country_Id and a second text field containing the unique values of Country, State, and Place?

Like:

  • 1,US California Texas Los Angeles San Francisco San Diego Austin
  • 2,IND Maharashtra Mumbai Pune Nagpur

Please ignore the meaning of the combined text field.

1 Answer

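Use groupby and apply, with a double join: the inner join combines the unique values of each column, and the outer join combines the columns, using a generator expression: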

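    # Select the text columns explicitly so the integer Country_Id key
    # is never passed to str.join, whatever the pandas version.
    (df.groupby('Country_Id')[['Country', 'State', 'Place']]
       .apply(lambda x: ' '.join(' '.join(x[col].unique()) for col in x))
       .to_frame('Country-State-Place'))

    Out[434]:
                                                   Country-State-Place
    Country_Id
    1           US California Texas Los Angeles San Francisco San Diego Austin
    2                                       IND Maharashtra Mumbai Pune Nagpur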
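
If you want Country_Id back as an ordinary column, as the question asks, rather than as the index, here is a minimal self-contained sketch of the same double-join idea. The names `text_cols` and `join_unique` are my own, chosen for illustration, and I am assuming every listed column holds strings with no missing values.

```python
import pandas as pd

data = [[1, "US", "California", "Los Angeles"],
        [1, "US", "California", "San Francisco"],
        [1, "US", "California", "San Diego"],
        [1, "US", "Texas", "Austin"],
        [2, "IND", "Maharashtra", "Mumbai"],
        [2, "IND", "Maharashtra", "Pune"],
        [2, "IND", "Maharashtra", "Nagpur"]]
df = pd.DataFrame(data, columns=["Country_Id", "Country", "State", "Place"])

# Hypothetical helper names; only these columns are joined.
text_cols = ["Country", "State", "Place"]

def join_unique(group):
    # Inner join: unique values of one column, in order of first appearance.
    # Outer join: the per-column strings, in the order given by text_cols.
    return " ".join(" ".join(group[col].unique()) for col in text_cols)

result = (
    df.groupby("Country_Id")[text_cols]
      .apply(join_unique)
      # Turn the Country_Id index into a column and name the text column.
      .reset_index(name="Country-State-Place")
)
print(result)
```

Selecting `text_cols` on the groupby keeps the integer key out of the join, and `reset_index(name=...)` yields a two-column frame. If a column can contain NaN, you would probably want to drop or fill those values before joining.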