How To Count Unique Values In A Python DataFrame

In this tutorial, we will learn how to count unique values in a DataFrame using Pandas, Python’s data manipulation library. Counting unique values is a common operation in data analysis and manipulation tasks such as outlier detection and data aggregation.

To follow along with this tutorial, you should have a basic understanding of Python and the Pandas library. If you’re new to Pandas, consider checking out the official 10 Minutes to Pandas guide for a quick introduction.

Step 1: Import Libraries and Create Sample Data

First, let’s import Pandas and create a sample DataFrame to work with. In this case, the DataFrame contains information about employees and their departments. Printing it gives:

       Name Department
0     Alice         HR
1       Bob         IT
2  Charlie         HR
3    David         IT
4      Eve         HR
5    Frank         IT
6    Grace         HR
7    Heidi         IT
8     Ivan         HR
9     Judy         IT
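
The code for this step is not reproduced above. A minimal sketch that builds a DataFrame matching the printed output could look like the following; the variable names data and df are assumptions, chosen only for illustration.

```python
import pandas as pd

# Sample data: ten employees alternating between two departments
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve",
             "Frank", "Grace", "Heidi", "Ivan", "Judy"],
    "Department": ["HR", "IT"] * 5,  # HR, IT, HR, IT, ...
}

# Build the DataFrame and display it
df = pd.DataFrame(data)
print(df)
```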

Step 2: Count Unique Values with Pandas

Now that we have our sample DataFrame, we can start counting unique values. To do this, we will use the Pandas nunique() method. Called on a Series, it returns the number of distinct values; called on a DataFrame, it returns that count for each column. By default, missing values (NaN) are not counted.

Let’s count the unique values in the Department column:

Number of unique departments: 2

As we can see, there are two unique values (HR and IT) in the Department column.
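
A minimal sketch of the call that produces this result is shown here. It rebuilds the sample DataFrame so that it runs on its own; the names df and unique_departments are assumptions for illustration.

```python
import pandas as pd

# Rebuild the sample DataFrame from Step 1
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve",
             "Frank", "Grace", "Heidi", "Ivan", "Judy"],
    "Department": ["HR", "IT"] * 5,
})

# Select one column (a Series) and count its distinct values
unique_departments = df["Department"].nunique()
print("Number of unique departments:", unique_departments)
```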

Step 3: Counting Unique Values for Each Column

What if we want to count unique values for each column in our DataFrame? We can do this by calling nunique() on the entire DataFrame, which returns one count per column:

Unique values in each column:
Name          10
Department     2
dtype: int64

This output tells us that there are 10 unique names and 2 unique departments in our DataFrame.
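
The per-column count above can be sketched as follows, again rebuilding the sample data so the snippet is self-contained. The name unique_per_column is an assumption; the result is a Series indexed by column name.

```python
import pandas as pd

# Rebuild the sample DataFrame from Step 1
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve",
             "Frank", "Grace", "Heidi", "Ivan", "Judy"],
    "Department": ["HR", "IT"] * 5,
})

# Calling nunique() on the DataFrame counts distinct values per column
unique_per_column = df.nunique()
print("Unique values in each column:")
print(unique_per_column)
```

Individual counts can then be read by column name, for example unique_per_column["Name"].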

Step 4: Getting a List of Unique Values

In some cases, we might want the unique values themselves, rather than just the count. To do this, we can use the unique() method on a column. Note that it returns an array rather than a Python list, as the output below shows:

Unique departments:
['HR' 'IT']

With this, we have extracted the unique values in the Department column, in order of first appearance.
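
A minimal sketch of this step follows, with the sample data rebuilt so it runs on its own. The names unique_values and unique_list are assumptions. The exact array type returned by unique(), and therefore how it prints, can depend on the column’s dtype and the Pandas version, so the sketch also converts the result to a plain Python list.

```python
import pandas as pd

# Rebuild the sample DataFrame from Step 1
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve",
             "Frank", "Grace", "Heidi", "Ivan", "Judy"],
    "Department": ["HR", "IT"] * 5,
})

# unique() returns the distinct values in order of first appearance
unique_values = df["Department"].unique()
print("Unique departments:")
print(unique_values)

# Convert to a regular Python list if a list is needed
unique_list = list(unique_values)
```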

Full Code
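
The full listing is not reproduced in this section. The sketch below is a reconstruction that is consistent with the outputs shown in Steps 1 to 4, not necessarily the original code; all variable names are assumptions.

```python
import pandas as pd

# Step 1: create the sample DataFrame
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve",
             "Frank", "Grace", "Heidi", "Ivan", "Judy"],
    "Department": ["HR", "IT"] * 5,
}
df = pd.DataFrame(data)
print(df)

# Step 2: count distinct values in a single column
unique_departments = df["Department"].nunique()
print("Number of unique departments:", unique_departments)

# Step 3: count distinct values in every column
unique_per_column = df.nunique()
print("Unique values in each column:")
print(unique_per_column)

# Step 4: get the distinct values themselves
unique_values = df["Department"].unique()
print("Unique departments:")
print(unique_values)
```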

Conclusion

In this tutorial, we have learned how to count unique values in a Python DataFrame using Pandas. We covered the nunique() method, which counts unique values in a single column or in every column of a DataFrame, and the unique() method, which returns the unique values themselves. With these methods, you can easily perform data analysis and aggregation tasks on your DataFrames. Happy coding!