Reading only specific columns from an Excel file is useful when working with large data sets, where many of the columns may be irrelevant to the task at hand.
Python provides several packages to work with Excel files, including pandas and openpyxl. In this tutorial, we will focus on using pandas to read specific columns from an Excel file in Python.
Steps to Read Specific Columns from Excel in Python
Step 1: Create an Excel file
Add the following contents to an Excel file called “filename.xlsx”, as a single column with the header Column Name followed by five values:
Column Name
Data 1
Data 2
Data 3
Data 4
Data 5
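If you would rather not create the file by hand, the same sheet can be generated from Python once the packages from Step 2 are installed. This is a minimal sketch, assuming the one-column layout described above and that writing into the current working directory is acceptable:

```python
import pandas as pd

# Build the one-column table described above: a header plus five values.
sample = pd.DataFrame({"Column Name": [f"Data {i}" for i in range(1, 6)]})

# index=False keeps the DataFrame's row index out of the worksheet.
# Writing .xlsx files needs an Excel writer engine such as openpyxl.
sample.to_excel("filename.xlsx", index=False)
```

The file is written next to the script, so the later steps can open it by its bare name.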
Step 2: Install Required Packages
To use the pandas package, we need to install it first by running the following command in the terminal:

    pip install pandas

pandas relies on the openpyxl package to read .xlsx files, so install it as well if it is not already present:

    pip install openpyxl
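To confirm that both packages are available before going further, a short check like the one below can help. It is only a sketch: the helper name installed_version is made up for this example, and it relies on the standard library module importlib.metadata (Python 3.8 or newer).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the status of the two packages this tutorial depends on.
for name in ("pandas", "openpyxl"):
    print(name, installed_version(name) or "not installed")
```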
Step 3: Import Required Libraries
After installing the required packages, we can import pandas in our Python script. By convention, it is imported under the alias pd:

    import pandas as pd
Step 4: Load Excel File using Pandas
Next, we need to load the Excel file using the pandas read_excel function:

    df = pd.read_excel('filename.xlsx')

By default, this loads the first worksheet of the Excel file into a pandas DataFrame object, using the first row as the column names.
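When only a few columns are needed, read_excel can also be told to load just those columns through its usecols parameter, instead of loading everything and selecting afterwards. The sketch below assumes a hypothetical three-column sheet (Name, Price, Quantity) written to a temporary file purely for demonstration; none of these names come from the tutorial's own file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical three-column sheet, written to a temporary file for the demo.
path = os.path.join(tempfile.mkdtemp(), "products.xlsx")
pd.DataFrame(
    {"Name": ["Pen", "Book"], "Price": [1.5, 12.0], "Quantity": [10, 3]}
).to_excel(path, index=False)

# Keep only the columns named in usecols.
by_name = pd.read_excel(path, usecols=["Name", "Price"])

# usecols also accepts Excel column letters, here columns A and C.
by_letter = pd.read_excel(path, usecols="A,C")

print(list(by_name.columns))    # ['Name', 'Price']
print(list(by_letter.columns))  # ['Name', 'Quantity']
```

Filtering at load time keeps the DataFrame small, which matters most for wide sheets.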
Step 5: Read Specific Column from DataFrame
Once we have loaded the Excel file into a DataFrame, we can read specific columns from it. We can use the following code to read a specific column:

    column_data = df['Column Name']

Replace 'Column Name' with the name of the column that you want to read. The name must match the header in the Excel file exactly, including case and spaces. This extracts the data from the specified column and stores it in a pandas Series object.
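The same bracket syntax extends to several columns at once, and a quick membership test avoids a KeyError when a column name is misspelled. The following sketch uses a small in-memory DataFrame with hypothetical column names in place of a loaded Excel file:

```python
import pandas as pd

# Hypothetical in-memory data standing in for a loaded Excel sheet.
df = pd.DataFrame(
    {"Name": ["Pen", "Book"], "Price": [1.5, 12.0], "Quantity": [10, 3]}
)

single = df["Price"]             # one label -> pandas Series
several = df[["Name", "Price"]]  # list of labels -> pandas DataFrame

# Check that a column exists before selecting it.
wanted = "Discount"
if wanted in df.columns:
    discount = df[wanted]
else:
    discount = None
    print(f"Column {wanted!r} not found; available: {list(df.columns)}")
```

Note the double brackets in the multi-column case: the inner pair is an ordinary Python list of column names.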
Step 6: Display Column Data
Finally, we can display the data using the following code:

    print(column_data)

This prints the values of the specified column alongside their row index, followed by the column's name and data type.
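Printing the whole Series is not the only option. As a rough illustration, the snippet below rebuilds the tutorial's column in memory (so it runs without the Excel file) and shows a few other common ways to inspect or export its values:

```python
import pandas as pd

# Stand-in for the Series read from 'filename.xlsx' in the previous step.
column_data = pd.Series(
    ["Data 1", "Data 2", "Data 3", "Data 4", "Data 5"], name="Column Name"
)

print(column_data.head(3))                 # first three rows, with index and dtype
print(column_data.to_string(index=False))  # values only, one per line
values = column_data.tolist()              # plain Python list of the values
print(values)
```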
Code Example
Here is a complete example that demonstrates how to read a specific column from an Excel file using pandas.

    # Import required libraries
    import pandas as pd

    # Load Excel file using Pandas
    df = pd.read_excel('filename.xlsx')

    # Read specific column from DataFrame
    column_data = df['Column Name']

    # Display Column Data
    print(column_data)
Conclusion
In this tutorial, we learned how to read specific columns from an Excel file using pandas in Python.
We installed the required packages, imported the required libraries, loaded the Excel file into a DataFrame, read a specific column from the DataFrame, and displayed the data. This can be a useful technique when working with large data sets in Python.