In this tutorial, we will learn how to sort a DataFrame in Python by a specific column. We will use the Pandas library, which provides simple yet powerful data structures for data analysis in Python.
Sorting is crucial when working with large datasets, because it lets us quickly view, analyze, and extract insights from the data.
Step 1: Import the Pandas library
To sort a DataFrame, the first step is to import the Pandas library. If you do not have Pandas installed, you can install it by running pip install pandas in your terminal, or !pip install pandas in a Jupyter Notebook cell. Once installed, you can import the library with the following line:
import pandas as pd
Step 2: Create a DataFrame
Now, we will create a simple DataFrame to sort. You can either create a DataFrame from scratch or load it from an existing file (such as CSV or Excel).
For this tutorial, let’s create a small example DataFrame from a dictionary:
data = {'Name': ['John', 'Eva', 'Sam', 'Nick', 'Maria'],
        'Age': [34, 43, 23, 29, 27],
        'Score': [88, 75, 94, 56, 63]}
df = pd.DataFrame(data)
Here’s what the example DataFrame looks like:
    Name  Age  Score
0   John   34     88
1    Eva   43     75
2    Sam   23     94
3   Nick   29     56
4  Maria   27     63
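If your data lives in a file instead, the same DataFrame can be loaded with pd.read_csv(). The sketch below is only an illustration: the CSV text is a made-up stand-in for a file on disk, and io.StringIO is used so the example runs without any external file. In practice you would pass a file path instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk
csv_text = """Name,Age,Score
John,34,88
Eva,43,75
Sam,23,94
Nick,29,56
Maria,27,63
"""

# read_csv accepts a file path or any file-like object;
# StringIO keeps this sketch self-contained
df_from_csv = pd.read_csv(io.StringIO(csv_text))

print(df_from_csv)
```

The resulting DataFrame has the same columns and rows as the one built from the dictionary, so the sorting steps that follow apply to it unchanged.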
Step 3: Sort the DataFrame based on the column
To sort the DataFrame based on a specific column, we will use the sort_values() method. This method takes several parameters, but the two most important ones are:
- by: The column name, or list of column names, by which you want to sort the DataFrame.
- ascending: A boolean value indicating whether to sort in ascending order (True, the default) or descending order (False).
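Both parameters also accept lists, which is useful when the first column contains ties. The following is a minimal sketch using a small made-up DataFrame (df_ties and its values are assumptions for illustration, not part of the tutorial's data): it sorts by Score from highest to lowest, and breaks ties by Age from lowest to highest.

```python
import pandas as pd

# Hypothetical data with a tie in 'Score' so the second sort key matters
df_ties = pd.DataFrame({
    'Name': ['Ann', 'Bob', 'Cal', 'Dee'],
    'Age': [30, 25, 41, 25],
    'Score': [80, 90, 80, 70],
})

# One boolean per column in `by`: Score descending, then Age ascending
multi_sorted = df_ties.sort_values(by=['Score', 'Age'], ascending=[False, True])

print(multi_sorted)
```

Here Ann and Cal share a score of 80, so the Age column decides that Ann (30) comes before Cal (41).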
Let’s sort our example DataFrame based on the ‘Score’ column:
sorted_df = df.sort_values(by='Score')
Here’s the sorted DataFrame:
    Name  Age  Score
3   Nick   29     56
4  Maria   27     63
1    Eva   43     75
0   John   34     88
2    Sam   23     94
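Notice that each row keeps its original index label, and that sort_values() returns a new DataFrame rather than modifying df. If you would rather have the sorted rows numbered from 0 again, one option is reset_index(drop=True). The sketch below rebuilds the example data so it can run on its own; the variable name renumbered is just an illustrative choice.

```python
import pandas as pd

data = {'Name': ['John', 'Eva', 'Sam', 'Nick', 'Maria'],
        'Age': [34, 43, 23, 29, 27],
        'Score': [88, 75, 94, 56, 63]}
df = pd.DataFrame(data)

sorted_df = df.sort_values(by='Score')

# By default the original df is left untouched
print(df.index.tolist())         # [0, 1, 2, 3, 4]
# The sorted copy carries the original index labels along with the rows
print(sorted_df.index.tolist())  # [3, 4, 1, 0, 2]

# drop=True discards the old labels instead of keeping them as a new column
renumbered = sorted_df.reset_index(drop=True)
print(renumbered)
```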
Step 4: Sort the DataFrame based on the column in descending order
If you want to sort the DataFrame in descending order, you can set the ascending parameter to False. Here’s how to sort our DataFrame based on the ‘Score’ column in descending order:
sorted_df_desc = df.sort_values(by='Score', ascending=False)
Here’s the sorted DataFrame in descending order:
    Name  Age  Score
2    Sam   23     94
0   John   34     88
1    Eva   43     75
4  Maria   27     63
3   Nick   29     56
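A common reason to sort in descending order is to pick out the highest values. As a small sketch, assuming we only want the three best scores, we can chain head(3) onto the sorted result (the name top_three is an illustrative choice):

```python
import pandas as pd

data = {'Name': ['John', 'Eva', 'Sam', 'Nick', 'Maria'],
        'Age': [34, 43, 23, 29, 27],
        'Score': [88, 75, 94, 56, 63]}
df = pd.DataFrame(data)

# Sort from highest to lowest score, then keep the first three rows
top_three = df.sort_values(by='Score', ascending=False).head(3)

print(top_three)
```

This keeps the rows for Sam, John, and Eva, in that order.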
Full code
import pandas as pd

data = {'Name': ['John', 'Eva', 'Sam', 'Nick', 'Maria'],
        'Age': [34, 43, 23, 29, 27],
        'Score': [88, 75, 94, 56, 63]}

df = pd.DataFrame(data)

sorted_df = df.sort_values(by='Score')
sorted_df_desc = df.sort_values(by='Score', ascending=False)

print(sorted_df)
print(sorted_df_desc)
Conclusion
In this tutorial, we learned how to sort a DataFrame based on a specific column using the Pandas library in Python. We saw how to sort the DataFrame in both ascending and descending order with sort_values(). Sorting DataFrames is essential for data analysis and lets us easily view and extract insights from the data.