4

I have a pandas DataFrame that I need to sort in a particular order in one column, and just ascending in another. Both columns have repeated values. It looks more or less like this:

import pandas as pd df = pd.DataFrame() df[0] = pd.Series( [ 'a', 'aa', 'c' ] * 2 ) df[1] = pd.Series( [ 1, 2 ] * 3 ) df[2] = pd.Series( range(6) ) print( df ) 0 1 2 0 a 1 0 1 aa 2 1 2 c 1 2 3 a 2 3 4 aa 1 4 5 c 2 5 

Now, say that I need to order by columns 0 and 1, but not alphabetically: Column 0 should first follow an order:

order = [ 'a', 'c', 'aa' ] 

How do I do that?

I would like to have it sorted like this:

print( sorted_df ) 0 1 2 0 a 1 0 1 a 2 3 2 c 1 2 3 c 2 5 4 aa 1 4 5 aa 2 1 

Using python 3.5.2, pandas 0.18.1

1
  • If you want it like that you could create an additional columns that holds the length of your strings in your 0-column and then sort first by length of strings and then by the strings themselves. Commented Feb 24, 2017 at 10:46

1 Answer 1

11

You can use pandas' categorical Series for this purpose which supplies the functionality of an individual sort order:

 df[0] = pd.Categorical(df[0], order) print(df.sort_values([0, 1])) 0 1 2 0 a 1 0 3 a 2 3 2 c 1 2 5 c 2 5 4 aa 1 4 1 aa 2 1 
Sign up to request clarification or add additional context in comments.

1 Comment

Hi, this one can help only if the 'order' given has unique elements. But how to custom sort if i want a column (say) from [1, 2, 5, 6, 1, 2, 3, 4] order to [1, 2, 3, 4, 5, 6, 1, 2] ? Basically if want I want to sort the dataframe by cyclic list given.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.