In this tutorial, we will learn how to sort a column in Excel using Python. We will be using the popular library Pandas to achieve this task. Pandas is widely used for data manipulation and analysis. It provides data structures and functions required to work with structured data seamlessly.
Step 1: Install Pandas
If you don’t have Pandas already installed, you can install it using pip. Open your command prompt or terminal and type the following command:
1 |
pip install pandas |
Step 2: Prepare the Excel file
For this tutorial, we will use a sample Excel file named “sample_data.xlsx” with the following columns and data:
Name Age Country John 35 USA Alice 28 UK David 28 Australia Charlie 22 USA
Save this file in your working directory.
Step 3: Read the Excel file using Pandas
We will start by importing the Pandas library and reading the Excel file into a DataFrame object. A DataFrame is a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns).
1 2 3 4 5 |
import pandas as pd file_name = "sample_data.xlsx" data_frame = pd.read_excel(file_name) print(data_frame) |
This will output the following data:
Name Age Country 0 John 35 USA 1 Alice 28 UK 2 David 28 Australia 3 Charlie 22 USA
Step 4: Sort the column
Now let’s sort the DataFrame based on the “Age” column in ascending order. We will use the sort_values
function provided by Pandas.
1 2 |
sorted_data_frame = data_frame.sort_values(by="Age") print(sorted_data_frame) |
This will output the sorted data:
Name Age Country 3 Charlie 22 USA 1 Alice 28 UK 2 David 28 Australia 0 John 35 USA
Step 5: Save the sorted data to a new Excel file
Finally, let’s save the sorted data into a new Excel file named “sorted_data.xlsx”.
1 2 |
sorted_file_name = "sorted_data.xlsx" sorted_data_frame.to_excel(sorted_file_name, index=False) |
This will create a new Excel file in your working directory with the sorted data.
Full Code
1 2 3 4 5 6 7 8 9 |
import pandas as pd file_name = "sample_data.xlsx" data_frame = pd.read_excel(file_name) sorted_data_frame = data_frame.sort_values(by="Age") sorted_file_name = "sorted_data.xlsx" sorted_data_frame.to_excel(sorted_file_name, index=False) |
Output
Conclusion
In this tutorial, we learned how to sort a column in an Excel file using Python and Pandas. By following these steps, you can easily sort any column in your Excel sheet and save the sorted data into a new file.