How can I sort a Pandas DataFrame according to multiple columns, with some of the columns being sorted in ascending order while other columns are sorted in descending order?
Furthermore, assume that the columns that will be used for sorting have too many unique values to be listed one by one for a sorting key or the pd.Categorical function (as suggested in this thread).
Here's a small reproducible example:
import pandas as pd
my_df = pd.DataFrame({'col1':['a','a','a','a','b','b','b','b','c','c','c','c'], 'col2':[1,1,2,2,1,1,2,2,1,1,2,2], 'col3':[1,2,1,2,1,2,1,2,1,2,1,2]})Suppose that, in the example above, I wanted to sort my_df as follows:
col1in Descending ordercol2in Descending ordercol3in Ascending order
1 Answer
The DataFrame.sort_values method can handle this very easily. Just use the ascending argument and provide a list of boolean values.
import pandas as pd
my_df = pd.DataFrame({'col1':['a','a','a','a','b','b','b','b','c','c','c','c'], 'col2':[1,1,2,2,1,1,2,2,1,1,2,2], 'col3':[1,2,1,2,1,2,1,2,1,2,1,2]})
my_df = my_df.sort_values(by=['col1','col2','col3'], ascending=[False, False, True])Note that the list provided in the ascending argument must have the same length as the one provided in the by argument.