I am trying to count, for each row index, how many times each label occurs across the columns of a pandas DataFrame. I have the following DataFrame:

    import pandas as pd

    d = {'col1': ['label1', 'label2', 'label3'],
         'col2': ['label2', 'label3', 'label1'],
         'col3': ['label2', 'label1', 'label3'],
         'col4': ['label3', 'label1', 'label2']}
    df = pd.DataFrame(data=d)

which displays as:

         col1    col2    col3    col4
    0  label1  label2  label2  label3
    1  label2  label3  label1  label1
    2  label3  label1  label3  label2

The idea is to count, for each index, how many times each label appears across all the columns, and to collect the result in an array (or DataFrame) as follows:

       label1  label2  label3
    0       1       2       1
    1       2       1       1
    2       1       1       2

This shows that, for example, label1 appears once at index 0, twice at index 1 and once at index 2 in the original DataFrame.

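
For reference, here is a minimal sketch of one way the table above could be produced. It assumes the default integer index and has not been benchmarked. The names `long` and `counts` are only illustrative. It reshapes the frame to long form with `melt` and then tabulates index against label with `pd.crosstab`:

```python
import pandas as pd

d = {'col1': ['label1', 'label2', 'label3'],
     'col2': ['label2', 'label3', 'label1'],
     'col3': ['label2', 'label1', 'label3'],
     'col4': ['label3', 'label1', 'label2']}
df = pd.DataFrame(data=d)

# Reshape to long form: one row per (row index, column, label) triple.
# reset_index() exposes the default RangeIndex as a column named "index".
long = df.reset_index().melt(id_vars="index", value_name="label")

# Cross-tabulate row index against label to get the per-index counts.
counts = pd.crosstab(long["index"], long["label"])
print(counts)
```

Whether this is fast enough inside a loop likely depends on the size of the DataFrame, so it is a baseline rather than a recommended solution.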
I am performing this operation inside a loop so an efficient method would be preferred. Any ideas?