How To Extract a Column From Excel In Python

In this tutorial, we will learn how to extract a specific column from an Excel file using Python. Several Python libraries, such as pandas and openpyxl, can work with Excel files. Here we use the widely used pandas library, which loads a worksheet into a DataFrame and lets us select a column by its name.

Before diving into the steps, make sure Python and the pandas library are installed on your machine. pandas reads .xlsx files through the openpyxl package, so it needs to be installed as well. You can install both with pip by running the following command:
pip install pandas openpyxl

Step 1: Read the Excel File

The first step is to load the Excel file into a pandas DataFrame. We will use the read_excel() function provided by pandas, which reads the first worksheet by default and treats the first row as the column headers. Let’s assume we have an Excel file named sample_data.xlsx with the following content:

Name         Age  Country
John          25  USA
Sophia        32  UK
Henry         28  Canada
Samantha      22  Australia
Emma          29  France
Matthew       35  Mexico

Now, let’s import the pandas library and read the data in this Excel file:
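
A minimal sketch of this step follows. The variable names path and df are illustrative. As an assumption made only so the snippet runs on its own, it writes the sample table to sample_data.xlsx when that file is missing; skip that part if you already have the workbook. Reading and writing .xlsx files assumes openpyxl is installed.

```python
from pathlib import Path

import pandas as pd

path = Path("sample_data.xlsx")

# Assumption for this sketch: create the sample workbook if it is missing,
# so the example can be run as-is.
if not path.exists():
    pd.DataFrame(
        {
            "Name": ["John", "Sophia", "Henry", "Samantha", "Emma", "Matthew"],
            "Age": [25, 32, 28, 22, 29, 35],
            "Country": ["USA", "UK", "Canada", "Australia", "France", "Mexico"],
        }
    ).to_excel(path, index=False)

# Read the first worksheet into a DataFrame; the first row becomes the header.
df = pd.read_excel(path)
```

To read a different worksheet, read_excel() accepts a sheet_name argument with either the sheet's name or its zero-based position.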

Step 2: Extract the Desired Column

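Let’s say we want to extract the Name column from the Excel file. We can do this by selecting the column by name, using square brackets on the DataFrame that we created from the Excel file (for example df["Name"], if the DataFrame is called df). The result is a pandas Series holding only that column's values. Here is how: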
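
The selection can be sketched as below. To keep the snippet independent of any file, a small in-memory DataFrame stands in for the one read from sample_data.xlsx; the names df, names and names_frame are illustrative.

```python
import pandas as pd

# Stand-in for the DataFrame returned by read_excel() in Step 1
df = pd.DataFrame(
    {
        "Name": ["John", "Sophia", "Henry", "Samantha", "Emma", "Matthew"],
        "Age": [25, 32, 28, 22, 29, 35],
        "Country": ["USA", "UK", "Canada", "Australia", "France", "Mexico"],
    }
)

# Single brackets with the column name return that column as a Series
names = df["Name"]

# Double brackets return a one-column DataFrame instead
names_frame = df[["Name"]]
```

The single-bracket form is usually what you want when you only need the values. The double-bracket form is handy when later code expects a DataFrame.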

Step 3: Print the Desired Column

Now that we have extracted the desired column, let’s print it to see if the extraction is correct:
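
A short sketch of the printing step. Here a hand-built Series named names (an illustrative name) stands in for the column extracted in Step 2, so the snippet does not depend on the Excel file.

```python
import pandas as pd

# Stand-in for the Name column extracted in Step 2
names = pd.Series(
    ["John", "Sophia", "Henry", "Samantha", "Emma", "Matthew"],
    name="Name",
)

# Printing a Series shows the row index, the values, and a summary line
print(names)
```

If you need the values as a plain Python list without the index, names.tolist() returns one.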

After running the above code, you will get the following output:

0        John
1      Sophia
2       Henry
3    Samantha
4        Emma
5     Matthew
Name: Name, dtype: object

Full code:
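
Putting the three steps together, a complete script might look like the sketch below. The names path, df and names are illustrative. The block that creates sample_data.xlsx is an assumption added so the script runs as-is, and can be removed once you point it at your own workbook. It assumes pandas and openpyxl are installed.

```python
from pathlib import Path

import pandas as pd

path = Path("sample_data.xlsx")

# Assumption for this sketch: create the sample workbook if it is missing.
if not path.exists():
    pd.DataFrame(
        {
            "Name": ["John", "Sophia", "Henry", "Samantha", "Emma", "Matthew"],
            "Age": [25, 32, 28, 22, 29, 35],
            "Country": ["USA", "UK", "Canada", "Australia", "France", "Mexico"],
        }
    ).to_excel(path, index=False)

# Step 1: read the Excel file into a DataFrame
df = pd.read_excel(path)

# Step 2: extract the desired column by its name
names = df["Name"]

# Step 3: print the extracted column
print(names)
```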

Conclusion

In this tutorial, we learned how to extract a specific column from an Excel file using Python and the pandas library.

With just a few lines of code, we can extract any column from an Excel file by its name.

Not only does this process save time compared to manual work in Excel, but it also opens possibilities for automation and further analysis using Python.