Sorting a Pandas Dataframe by Multiple Columns
Sorting a Pandas DataFrame by multiple columns is a common operation in data analysis. Consider a DataFrame with columns 'a' and 'b' (and possibly others, such as 'c'). To sort it by column 'a' in ascending order and, within equal values of 'a', by column 'b' in descending order, pass a list of column names together with a matching list of sort directions.
Starting with Pandas 0.17.0, the DataFrame sort method was deprecated in favor of sort_values, and in 0.20.0 sort was removed entirely. When the column names are passed positionally, as below, the arguments and results are the same:
df.sort_values(['a', 'b'], ascending=[True, False])
On Pandas versions before 0.20.0 only, the equivalent call with the deprecated sort method is:
df.sort(['a', 'b'], ascending=[True, False])
For example, consider a dataframe df1 with random integer values in columns 'a' and 'b':
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randint(1, 5, (10, 2)), columns=['a', 'b'])
Sorting this dataframe by 'a' in ascending order and 'b' in descending order gives:
df1.sort_values(['a', 'b'], ascending=[True, False])
   a  b
2  1  4
7  1  3
1  1  2
3  1  2
4  3  2
6  4  4
0  4  3
9  4  3
5  4  1
8  4  1
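Because df1 is filled with random values, its output differs from run to run. The following is a minimal reproducible sketch of the same sort, assuming a small hand-written frame; the name df2, its values, and the extra column 'c' are illustrative and not part of the original example.

```python
import pandas as pd

# Hypothetical data: repeated values in 'a' make the effect of the
# second sort key ('b', descending) visible.
df2 = pd.DataFrame({
    "a": [2, 1, 2, 1, 3],
    "b": [5, 9, 7, 3, 1],
    "c": ["v", "w", "x", "y", "z"],
})

# Sort by 'a' ascending; within equal 'a', sort by 'b' descending.
# Column 'c' is not a sort key and simply travels with its row.
sorted_df2 = df2.sort_values(by=["a", "b"], ascending=[True, False])

print(sorted_df2)
```

The original index labels are kept, so the printed index shows where each row came from in df2.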
Remember that sort_values does not sort in place by default. To update df1 with the sorted values, assign the result back to df1 or pass inplace=True in the method call:
df1 = df1.sort_values(['a', 'b'], ascending=[True, False])
or
df1.sort_values(['a', 'b'], ascending=[True, False], inplace=True)
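The difference between the two forms can be checked directly. This is a small sketch assuming a hypothetical three-row frame named df3; it records the row order before and after each call rather than relying on printed output.

```python
import pandas as pd

# Hypothetical frame used only to illustrate the two calling styles.
df3 = pd.DataFrame({"a": [2, 1, 1], "b": [1, 2, 3]})

# Default call: a new, sorted DataFrame is returned and df3 is untouched.
result = df3.sort_values(["a", "b"], ascending=[True, False])
order_after_plain_call = df3.index.tolist()

# inplace=True: df3 itself is reordered and the call returns None,
# so the return value should not be assigned back to df3.
returned = df3.sort_values(["a", "b"], ascending=[True, False], inplace=True)
order_after_inplace_call = df3.index.tolist()

print(order_after_plain_call, order_after_inplace_call, returned)
```

Assigning the result back (the first form) is usually the safer habit, since writing df3 = df3.sort_values(..., inplace=True) would replace the frame with None.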