This tutorial shows you how to filter a CSV (Comma Separated Values) file using Python. CSV is a simple plain-text format used to store tabular data, such as data exported from a spreadsheet or database.
Python provides several ways to work with CSV files, including the built-in csv module and the pandas library. The examples in this tutorial use pandas to load, inspect, and filter the data.
Step 1: Importing Necessary Libraries
Before we begin, you need to install and import the necessary Python libraries. The csv module ships with Python, so only pandas has to be installed. Run the following command in a terminal:
pip install pandas
Then import the necessary libraries using the code below:
import csv
import pandas as pd
Step 2: Reading a CSV File
After importing the needed libraries, the next step is to read our CSV file.
data = pd.read_csv("data.csv")
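If you do not have a data.csv at hand, the following sketch writes a small sample file and loads it with the same pd.read_csv call. The six rows are copied from the output shown later in this tutorial, and the real file may contain more. Writing into a temporary directory is an assumption made here so that no existing file gets overwritten.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Sample rows (assumed): copied from the output shown later in this tutorial.
sample_csv = """Name,Age,Location
Alice,28,New York
Bob,35,Los Angeles
Charlie,42,Chicago
David,25,San Francisco
Eve,31,Boston
Frank,45,Miami
"""

# Write the sample into a temporary directory so no existing file is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "data.csv"
    csv_path.write_text(sample_csv, encoding="utf-8")

    # Same call as in the tutorial, pointed at the sample file.
    data = pd.read_csv(csv_path)

# read_csv loads everything into memory, so the DataFrame outlives the file.
print(data.shape)
```

By default, read_csv treats the first line as the header row, which is why Name, Age and Location become column names rather than data.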
Step 3: Inspecting the CSV File
Now that we have our CSV file loaded, we can take a look at the data to determine what we might want to filter.
print(data.head())
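head() only shows the first five rows. Before choosing a filter it can also help to check the column names, the inferred types and whether any values are missing. The sketch below does that on an in-memory copy of the sample rows (an assumption; with a real file you would keep using pd.read_csv("data.csv")).

```python
import io

import pandas as pd

# In-memory stand-in for data.csv (assumed sample rows).
sample_csv = io.StringIO(
    "Name,Age,Location\n"
    "Alice,28,New York\n"
    "Bob,35,Los Angeles\n"
    "Charlie,42,Chicago\n"
    "David,25,San Francisco\n"
    "Eve,31,Boston\n"
    "Frank,45,Miami\n"
)
data = pd.read_csv(sample_csv)

print(data.columns.tolist())   # column names, needed to write a filter
print(data.dtypes)             # type pandas inferred for each column
print(data["Age"].describe())  # count, mean, min, max and quartiles of Age
print(data.isna().sum())       # number of missing values per column
```

A numeric comparison such as Age > 30 only behaves as expected when the column was read as a number, so the dtypes check is worth doing first.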
Step 4: Filtering a CSV File
After identifying the column of interest, build a condition on it and use that condition to index the DataFrame. In this example, we will keep only the rows where the Age is greater than 30.
filtered_data = data[data['Age'] > 30]
print(filtered_data.head())
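The expression data['Age'] > 30 produces a boolean Series, and indexing with it keeps the rows marked True. The same idea extends to combined conditions, membership tests and the query method. The sketch below is a minimal illustration on assumed sample rows; the variable names are only illustrative.

```python
import io

import pandas as pd

# In-memory stand-in for data.csv (assumed sample rows).
data = pd.read_csv(io.StringIO(
    "Name,Age,Location\n"
    "Alice,28,New York\n"
    "Bob,35,Los Angeles\n"
    "Charlie,42,Chicago\n"
    "David,25,San Francisco\n"
    "Eve,31,Boston\n"
    "Frank,45,Miami\n"
))

# Single condition: the boolean mask marks the rows to keep.
mask = data["Age"] > 30
older = data[mask]

# Combine conditions with & (and) or | (or); each one needs its own parentheses.
in_range = data[(data["Age"] > 30) & (data["Age"] < 45)]

# Keep rows whose value appears in a list.
selected_cities = data[data["Location"].isin(["New York", "Boston"])]

# query() expresses the same filter as a string.
queried = data.query("Age > 30")

print(older)
print(in_range)
print(selected_cities)
```

Python's plain and / or keywords do not work on a Series, which is why the element-wise operators & and | are used here.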
Below is the full code for filtering a CSV file in Python.
Full Code
import csv
import pandas as pd

# Read the CSV file
data = pd.read_csv("data.csv")

# Print the first five rows
print(data.head())

# Filter and print the first five rows where 'Age' is greater than 30
filtered_data = data[data['Age'] > 30]
print()
print(filtered_data.head())
Output:
      Name  Age       Location
0    Alice   28       New York
1      Bob   35    Los Angeles
2  Charlie   42        Chicago
3    David   25  San Francisco
4      Eve   31         Boston

      Name  Age     Location
1      Bob   35  Los Angeles
2  Charlie   42      Chicago
4      Eve   31       Boston
5    Frank   45        Miami
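For comparison, the same filter can be sketched with the built-in csv module alone, which may be handy when pandas is not available. This is one possible approach, not the code used above. It reads from an in-memory string of assumed sample rows. For a real file you would open it with open("data.csv", newline="") and pass the file object to csv.DictReader.

```python
import csv
import io

# In-memory stand-in for data.csv (assumed sample rows).
sample_csv = io.StringIO(
    "Name,Age,Location\n"
    "Alice,28,New York\n"
    "Bob,35,Los Angeles\n"
    "Charlie,42,Chicago\n"
    "David,25,San Francisco\n"
    "Eve,31,Boston\n"
    "Frank,45,Miami\n"
)

# DictReader uses the header row as keys and yields one dict per data row.
reader = csv.DictReader(sample_csv)

# The csv module returns every value as a string, so Age is converted
# to int before the comparison.
filtered_rows = [row for row in reader if int(row["Age"]) > 30]

for row in filtered_rows:
    print(row["Name"], row["Age"], row["Location"])
```

Unlike pandas, the csv module does no type inference, so the explicit int() conversion is what makes the numeric comparison work.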
Conclusion
That’s it! You have now learned how to filter a CSV file in Python. This is a very useful skill when you are dealing with larger datasets and you are interested only in specific records.
Filtering does not modify the original file, so save the filtered data to a new file if you want to keep the result. Knowing how to filter a CSV file in Python can significantly speed up your data analysis process and allow you to focus on tasks that add value.
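Saving the filtered rows can be sketched with DataFrame.to_csv. The output name filtered_data.csv and the temporary directory are assumptions made for this example; in practice you would pass whatever path you want to write to.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# In-memory stand-in for data.csv (assumed sample rows).
data = pd.read_csv(io.StringIO(
    "Name,Age,Location\n"
    "Alice,28,New York\n"
    "Bob,35,Los Angeles\n"
    "Charlie,42,Chicago\n"
    "David,25,San Francisco\n"
    "Eve,31,Boston\n"
    "Frank,45,Miami\n"
))
filtered_data = data[data["Age"] > 30]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical output name, written to a temporary directory.
    out_path = Path(tmp_dir) / "filtered_data.csv"

    # index=False leaves the row labels (1, 2, 4, 5) out of the file.
    filtered_data.to_csv(out_path, index=False)

    # Read the file back as text to check what was written.
    saved_text = out_path.read_text(encoding="utf-8")

print(saved_text)
```

Without index=False, pandas writes the row labels as an extra unnamed first column, which then shows up as a stray column when the file is read again.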