In this tutorial, you will learn how to find duplicate strings in a list of strings using Python. Detecting duplicate strings is useful in various programming scenarios like data cleaning, validation, and analysis.
Python’s rich collection of data structures and built-in functions makes it easy to achieve this task efficiently.
1. Prepare the list of strings
First, create a list of strings that includes some duplicate values. This list will be used as input for our Python code to detect duplicate strings.
string_list = ["apple", "banana", "orange", "grape", "banana", "apple", "pineapple", "orange"]
2. Find duplicate strings using the collections module
Python’s collections module provides a handy Counter class that counts how many times each element occurs in a list. It can be used to detect duplicate strings effectively.
from collections import Counter
string_count = Counter(string_list)
Now the string_count variable holds a dictionary-like object with the strings as keys and their occurrence counts as values.
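To make the "dictionary-like" behavior concrete, here is a minimal sketch that inspects the Counter built from the sample list. The lookup of "kiwi" is only an illustrative string that is not in the list.

```python
from collections import Counter

# Same sample list as in step 1
string_list = ["apple", "banana", "orange", "grape", "banana", "apple", "pineapple", "orange"]
string_count = Counter(string_list)

# Look up a count by key, as with a regular dict
print(string_count["apple"])  # 2

# A string that never appeared yields 0 instead of raising KeyError
print(string_count["kiwi"])  # 0

# most_common(n) returns the n highest (string, count) pairs;
# ties keep the order in which the strings were first seen
print(string_count.most_common(3))  # [('apple', 2), ('banana', 2), ('orange', 2)]
```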
3. Keep only the duplicate strings
To keep only the duplicate strings, we can iterate over the string_count object and add every string that occurs more than once to a new list.
duplicates = [item for item, count in string_count.items() if count > 1]
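In this step, we used a list comprehension to loop through the items() of the string_count object and check whether the count of each item is greater than 1, which indicates a duplicate. Each string that meets this condition is added to the duplicates list. Because a Counter remembers the order in which keys were first seen, the duplicates appear in the order of their first occurrence in the original list.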
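If you also want to know how often each duplicate occurs, the same condition can be used in a dict comprehension. This is a small variation on the step above rather than part of the original walkthrough, and the name duplicate_counts is chosen here only for illustration.

```python
from collections import Counter

# Same sample list as in step 1
string_list = ["apple", "banana", "orange", "grape", "banana", "apple", "pineapple", "orange"]
string_count = Counter(string_list)

# Map each duplicated string to the number of times it appears
duplicate_counts = {item: count for item, count in string_count.items() if count > 1}

print(duplicate_counts)  # {'apple': 2, 'banana': 2, 'orange': 2}
```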
4. Print the duplicates
Finally, print the list of duplicate strings.
print("Duplicate Strings:", duplicates)
Full Code
from collections import Counter

# List of strings
string_list = ["apple", "banana", "orange", "grape", "banana", "apple", "pineapple", "orange"]

# Count occurrences of each string
string_count = Counter(string_list)

# Filter out duplicate strings
duplicates = [item for item, count in string_count.items() if count > 1]

# Print the duplicates
print("Duplicate Strings:", duplicates)
Output
Duplicate Strings: ['apple', 'banana', 'orange']
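For reuse across several lists, the same steps could be wrapped in a small helper. The function name find_duplicates is hypothetical and is not part of the standard library or of the script above; this is only a sketch of one way to package the logic.

```python
from collections import Counter


def find_duplicates(strings):
    """Return the strings that occur more than once, in first-seen order."""
    counts = Counter(strings)
    return [item for item, count in counts.items() if count > 1]


print(find_duplicates(["apple", "banana", "apple"]))  # ['apple']
print(find_duplicates(["a", "b", "c"]))  # []
```

Note that the comparison is case-sensitive, so "Apple" and "apple" are counted as different strings.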
Conclusion
In this tutorial, we have learned how to find duplicate strings in a list of strings using Python. We used the Counter class from the collections module to count the occurrences of each string and then kept only the strings that occur more than once.
This approach is concise and efficient, since it counts every string in a single pass over the list, and it keeps the code easy to read and understand.