pandas.DataFrame.sort_values
- DataFrame.sort_values(by, *, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)
- Sort by the values along either axis.
- Parameters:
- by : str or list of str
- Name or list of names to sort by.
  - If axis is 0 or ‘index’, then by may contain index levels and/or column labels.
  - If axis is 1 or ‘columns’, then by may contain column levels and/or index labels.
 
- axis : {0 or ‘index’, 1 or ‘columns’}, default 0
- Axis to be sorted. 
- ascending : bool or list of bool, default True
- Sort ascending vs. descending. Specify a list for multiple sort orders. If this is a list of bools, it must match the length of by.
- inplace : bool, default False
- If True, perform operation in-place. 
- kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, default ‘quicksort’
- Choice of sorting algorithm. See also numpy.sort() for more information. mergesort and stable are the only stable algorithms. For DataFrames, this option is only applied when sorting on a single column or label.
- na_position : {‘first’, ‘last’}, default ‘last’
- Puts NaNs at the beginning if first; last puts NaNs at the end. 
- ignore_index : bool, default False
- If True, the resulting axis will be labeled 0, 1, …, n - 1. 
- key : callable, optional
- Apply the key function to the values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect a Series and return a Series with the same shape as the input. It will be applied to each column in by independently. The values in the returned Series will be used as the keys for sorting.
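The parameters above combine in ways that are easier to see in code. The following is a minimal sketch, not part of the original reference; the frames and the column names (team, score, k, v) are made up for illustration. It shows a list passed to ascending, the effect of ignore_index, and a stable sort on a single column.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {
        "team": ["x", "y", "x", "y"],
        "score": [3, 1, 2, 4],
    }
)

# One entry in `ascending` per name in `by`:
# team ascending, then score descending within each team.
mixed = df.sort_values(by=["team", "score"], ascending=[True, False])

# ignore_index=True relabels the result 0, 1, ..., n - 1
# instead of keeping the original row labels.
relabeled = df.sort_values(by="score", ignore_index=True)

# A stable algorithm keeps tied rows in their original relative order.
ties = pd.DataFrame({"k": [1, 0, 1, 0], "v": ["a", "b", "c", "d"]})
stable = ties.sort_values(by="k", kind="stable")
```

Here mixed keeps the original row labels, so its index shows how the rows were reordered, while relabeled discards them.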
 
- Returns:
- DataFrame or None
- DataFrame with sorted values or None if inplace=True.
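The difference between the two return modes is easy to miss, so here is a short sketch under assumed data (the single column a is hypothetical). With the default inplace=False the caller gets a new DataFrame and the original is untouched; with inplace=True the original is reordered and the call returns None.

```python
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 2]})

# Default: a new, sorted DataFrame is returned; df keeps its order.
sorted_copy = df.sort_values(by="a")
before = df["a"].tolist()

# In-place: df itself is reordered and the call returns None.
result = df.sort_values(by="a", inplace=True)
after = df["a"].tolist()
```

Because the in-place call returns None, assigning its result back to df would discard the data.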
 
- See also
- DataFrame.sort_index
- Sort a DataFrame by the index. 
- Series.sort_values
- Similar method for a Series. 
- Examples

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "col1": ["A", "A", "B", np.nan, "D", "C"],
...         "col2": [2, 1, 9, 8, 7, 4],
...         "col3": [0, 1, 9, 4, 2, 3],
...         "col4": ["a", "B", "c", "D", "e", "F"],
...     }
... )
>>> df
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
3  NaN     8     4    D
4    D     7     2    e
5    C     4     3    F

Sort by a single column

In this case, we are sorting the rows according to the values in col1:

>>> df.sort_values(by=["col1"])
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
5    C     4     3    F
4    D     7     2    e
3  NaN     8     4    D

Sort by multiple columns

You can also pass multiple columns to the by argument, as shown below. In this example, the rows are first sorted according to col1, and then the rows that have an identical value in col1 are sorted according to col2.

>>> df.sort_values(by=["col1", "col2"])
  col1  col2  col3 col4
1    A     1     1    B
0    A     2     0    a
2    B     9     9    c
5    C     4     3    F
4    D     7     2    e
3  NaN     8     4    D

Sort in descending order

The sort order can be reversed using the ascending argument, as shown below:

>>> df.sort_values(by="col1", ascending=False)
  col1  col2  col3 col4
4    D     7     2    e
5    C     4     3    F
2    B     9     9    c
0    A     2     0    a
1    A     1     1    B
3  NaN     8     4    D

Placing any NA first

Note that in the above example, the rows that contain an NA value in col1 are placed at the end of the dataframe. This behavior can be modified via the na_position argument, as shown below:

>>> df.sort_values(by="col1", ascending=False, na_position="first")
  col1  col2  col3 col4
3  NaN     8     4    D
4    D     7     2    e
5    C     4     3    F
2    B     9     9    c
0    A     2     0    a
1    A     1     1    B

Customized sort order

The key argument allows for further customization of sorting behaviour. For example, you may want to ignore letter case when sorting strings:

>>> df.sort_values(by="col4", key=lambda col: col.str.lower())
  col1  col2  col3 col4
0    A     2     0    a
1    A     1     1    B
2    B     9     9    c
3  NaN     8     4    D
4    D     7     2    e
5    C     4     3    F

Another typical example is natural sorting. This can be done using the natsort package, which provides a function to generate a key to sort data in their natural order:

>>> df = pd.DataFrame(
...     {
...         "hours": ["0hr", "128hr", "0hr", "64hr", "64hr", "128hr"],
...         "mins": [
...             "10mins",
...             "40mins",
...             "40mins",
...             "40mins",
...             "10mins",
...             "10mins",
...         ],
...         "value": [10, 20, 30, 40, 50, 60],
...     }
... )
>>> df
   hours    mins  value
0    0hr  10mins     10
1  128hr  40mins     20
2    0hr  40mins     30
3   64hr  40mins     40
4   64hr  10mins     50
5  128hr  10mins     60
>>> from natsort import natsort_keygen
>>> df.sort_values(
...     by=["hours", "mins"],
...     key=natsort_keygen(),
... )
   hours    mins  value
0    0hr  10mins     10
2    0hr  40mins     30
4   64hr  10mins     50
3   64hr  40mins     40
5  128hr  10mins     60
1  128hr  40mins     20
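A key function does not need a third-party package. As a further sketch (the delta column and its values are assumptions made for this illustration), a vectorized key can order rows by magnitude: the callable receives the whole column as a Series and returns a Series of the same shape, and only the ordering is affected, not the stored values.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"delta": [-3, 1, -2, 4]})

# The key receives the whole "delta" column as a Series and must return
# a Series of the same shape; Series.abs() satisfies both requirements.
by_magnitude = df.sort_values(by="delta", key=lambda col: col.abs())
```

The result still contains the signed values; the absolute values are used only as sort keys.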