I am trying to count, for each row index, how many times each label appears across the columns of a Pandas DataFrame. I have the following DataFrame:

    import pandas as pd

    d = {'col1': ['label1', 'label2', 'label3'],
         'col2': ['label2', 'label3', 'label1'],
         'col3': ['label2', 'label1', 'label3'],
         'col4': ['label3', 'label1', 'label2']}
    df = pd.DataFrame(data=d)

which formats as:

         col1    col2    col3    col4
    0  label1  label2  label2  label3
    1  label2  label3  label1  label1
    2  label3  label1  label3  label2

The idea is to count, per index, the occurrences of each label across all columns and collect them into an array (or DataFrame) as follows:

       label1  label2  label3
    0       1       2       1
    1       2       1       1
    2       1       1       2

This shows that, for example, label1 appears once at index 0, twice at index 1, and once at index 2 in the original DataFrame.

I am performing this operation inside a loop so an efficient method would be preferred. Any ideas?

1 Answer

Use:

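    # Count the labels in each row (axis=1). pd.Series.value_counts is used
    # because the top-level pd.value_counts is deprecated in recent pandas versions.
    df = df.apply(pd.Series.value_counts, axis=1)
    print(df)

       label1  label2  label3
    0       1       2       1
    1       2       1       1
    2       1       1       2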
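Since the question asks for something efficient inside a loop, here is a possible alternative that avoids calling a function once per row. This is only a sketch, not part of the original answer: it flattens the frame with NumPy and lets `pd.crosstab` do the counting. The names `labels`, `rows` and `counts` are my own, and whether it is actually faster depends on the shape of your data, so it is worth timing both versions.

```python
import numpy as np
import pandas as pd

d = {'col1': ['label1', 'label2', 'label3'],
     'col2': ['label2', 'label3', 'label1'],
     'col3': ['label2', 'label1', 'label3'],
     'col4': ['label3', 'label1', 'label2']}
df = pd.DataFrame(data=d)

# Flatten the cells row by row, so the first df.shape[1] entries belong to row 0.
labels = df.to_numpy().ravel()

# Repeat each row index once per column so it lines up with the flattened cells.
rows = np.repeat(df.index.to_numpy(), df.shape[1])

# Cross-tabulate row index against label: one row per index, one column per label.
counts = pd.crosstab(rows, labels)
print(counts)
```

One difference from the `apply` version: a label that never occurs in a given row shows up as 0 here, whereas `apply` with `value_counts` leaves NaN in that cell.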