Python is widely used thanks to its readable syntax, feature-rich standard library, and extensive collection of third-party libraries.
One task Python handles especially well is filtering data, an essential skill for anyone who works with large amounts of data and needs clean datasets.
In this tutorial, we will explore how to filter data in Python using four techniques: the built-in filter() function, lambda functions, list comprehensions with conditions, and Pandas DataFrames.
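Step 1: Basic Data Filtering Using Python’s Built-in filter() Function
Python provides a built-in function, filter(), for selecting items from a collection. It takes two arguments: a function that returns True or False for each item, and an iterable such as a list. It keeps only the items for which the function returns True. In Python 3, filter() returns an iterator, so we wrap the result in list() to see the values.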
    def positive(x):
        return x > 0

    nums = [40, -10, 20, -30, 50]
    filtered = filter(positive, nums)
    print(list(filtered))

Output:

    [40, 20, 50]
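A few details of filter() are easy to miss, so here is a minimal sketch that illustrates them. The readings tuple and the is_positive helper are made up for this illustration. It shows that filter() accepts any iterable, that the iterator it returns can only be consumed once, and that passing None as the function keeps only the truthy items.

```python
# Hypothetical sample data; filter() accepts any iterable, here a tuple.
readings = (3, 0, -7, 12, 0, 5)


def is_positive(x):
    """Return True for values strictly greater than zero."""
    return x > 0


# In Python 3, filter() returns a lazy iterator, so nothing is computed yet.
positive_iter = filter(is_positive, readings)

# list() materializes the results and consumes the iterator.
positives = list(positive_iter)
print(positives)

# A second pass over the same iterator yields nothing: it is exhausted.
leftover = list(positive_iter)
print(leftover)

# Passing None instead of a function keeps only the truthy items,
# which drops the zeros but keeps the negative value.
non_zero = list(filter(None, readings))
print(non_zero)
```

If the filtered values are needed more than once, it is usually simplest to store the list() result rather than the iterator itself.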
Step 2: Advanced Data Filtering Using Python’s Lambda Functions
We can also pass a lambda function to filter(). A lambda is a short, anonymous function written inline as a single expression, which saves us from defining a separate named function for a simple condition.
    nums = [40, -10, 20, -30, 50]
    filtered = filter(lambda x: x > 0, nums)
    print(list(filtered))

Output:

    [40, 20, 50]
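Lambdas are especially handy when the items are records rather than plain numbers. The sketch below assumes a small, made-up list of dictionaries (the people list and its keys are illustrative only) and filters it on one field.

```python
# Hypothetical records for illustration; each item is a dictionary.
people = [
    {"name": "John", "age": 28},
    {"name": "Anna", "age": 24},
    {"name": "Peter", "age": 35},
]

# The lambda receives one record at a time and checks its "age" field.
over_25 = list(filter(lambda person: person["age"] > 25, people))

# Pull out just the names of the records that passed the filter.
names = [person["name"] for person in over_25]
print(names)
```

When the condition grows beyond a single short expression, a named function as in Step 1 is usually easier to read and test.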
Step 3: Filtering Data Based on Conditions Using List Comprehensions
We can also use a list comprehension to iterate over a list and keep only the items that satisfy a condition.
    nums = [40, -10, 20, -30, 50]
    filtered = [num for num in nums if num > 0]
    print(filtered)

Output:

    [40, 20, 50]
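Comprehensions can do more than apply a single test. As a rough sketch, the example below reuses the nums list and adds a made-up ages dictionary to show a combined condition, filtering and transforming in one step, and the same idea applied to a dictionary.

```python
nums = [40, -10, 20, -30, 50]

# Combine conditions: keep values strictly between 0 and 50.
mid_range = [n for n in nums if 0 < n < 50]
print(mid_range)

# Filter and transform together: halve only the positive values.
halved = [n // 2 for n in nums if n > 0]
print(halved)

# Hypothetical name-to-age mapping, filtered with a dict comprehension.
ages = {"John": 28, "Anna": 24, "Peter": 35, "Linda": 32}
over_30 = {name: age for name, age in ages.items() if age > 30}
print(over_30)
```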
Step 4: Filtering Data Using Pandas DataFrames
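Pandas is a powerful third-party data manipulation library for Python. With it, we can filter tabular data by indexing a DataFrame with a Boolean condition such as df["Age"] > 30, which keeps only the rows where the condition is true.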
    import pandas as pd

    data = {
        "Name": ["John", "Anna", "Peter", "Linda"],
        "Age": [28, 24, 35, 32],
        "City": ["New York", "Paris", "Berlin", "London"]
    }

    df = pd.DataFrame(data)
    filtered_df = df[df["Age"] > 30]
    print(filtered_df)

Output:

        Name  Age    City
    2  Peter   35  Berlin
    3  Linda   32  London
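The same DataFrame can be filtered in several other ways. The sketch below rebuilds the table from this step and shows three common patterns, assuming Pandas is installed: combining conditions with & (each condition needs its own parentheses), matching against a list of values with isin(), and writing the condition as a string with query(). The variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Anna", "Peter", "Linda"],
    "Age": [28, 24, 35, 32],
    "City": ["New York", "Paris", "Berlin", "London"],
})

# Combine conditions with & (and) or | (or); wrap each one in parentheses.
older_not_berlin = df[(df["Age"] > 30) & (df["City"] != "Berlin")]
print(older_not_berlin)

# Keep rows whose City appears in a list of allowed values.
in_cities = df[df["City"].isin(["Paris", "London"])]
print(in_cities)

# Express the condition as a string using column names.
under_30 = df.query("Age < 30")
print(under_30)
```

Python's plain and/or keywords do not work element-wise on Pandas columns, which is why the example uses & instead.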
Conclusion
Filtering data is a key part of data preparation and preprocessing. This tutorial provided an introduction to how to filter data in Python using several techniques. By practicing these techniques, Python users can make their workflows more efficient and effective.