Counting unique values in a column of a pandas DataFrame, like in Qlik?

Matheus Mello
September 2, 2023

Counting Unique Values in a Pandas DataFrame like in Qlik

Hey there tech enthusiasts! 👋 Are you struggling to count unique values in a column of a pandas DataFrame, just like in Qlik? Look no further! In this blog post, we will explore a common issue that many developers face and provide easy solutions using Python's pandas library. So, let's dive right in! 💪

The Scenario

Imagine you have a pandas DataFrame like this:

import pandas as pd

df = pd.DataFrame({
    'hID': [101, 102, 103, 101, 102, 104, 105, 101],
    'dID': [10, 11, 12, 10, 11, 10, 12, 10],
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

Qlik users are familiar with the count(distinct hID) function, which would return a count of 5 for unique hID values in Qlik. But how can you achieve the same result using a Python pandas DataFrame or a numpy array? Let's find out! 💡

Counting Unique Values in pandas

Solution 1: Using the nunique() Function

The simplest way to count unique values in a pandas DataFrame column is the nunique() method, which returns the number of distinct values in the column. By default it ignores missing values (NaN), because its dropna parameter defaults to True.

To count unique hID values in the DataFrame df, you can simply do:

unique_hID_count = df['hID'].nunique()
print(unique_hID_count)

Output:

5

Isn't that super easy? You get the same count of 5, just like in Qlik! 🎉
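nunique() is not limited to a single column. As a minimal sketch using the same sample data, the block below calls it on the whole DataFrame to get a distinct count per column, and after a groupby to get the distinct hID count per dID, which is roughly what you see in Qlik when count(distinct hID) sits in a chart with dID as the dimension. The variable names per_column and per_dID are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'hID': [101, 102, 103, 101, 102, 104, 105, 101],
    'dID': [10, 11, 12, 10, 11, 10, 12, 10],
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

# Distinct count for every column at once (returns a Series indexed by column name)
per_column = df.nunique()
print(per_column.to_dict())  # {'hID': 5, 'dID': 3, 'uID': 5, 'mID': 3}

# Distinct hID values within each dID group
per_dID = df.groupby('dID')['hID'].nunique()
print(per_dID.to_dict())  # {10: 2, 11: 1, 12: 2}
```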

Solution 2: Using the drop_duplicates() Function

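Another approach to counting unique values is the drop_duplicates() method. With subset='hID' it returns a new DataFrame that keeps only the first row for each hID value, and you can then count the remaining rows. This does more work than nunique(), so it is mainly useful when you also need the de-duplicated rows themselves.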

To count unique hID values using this method, follow these steps:

# Drop duplicate rows based on 'hID'
unique_df = df.drop_duplicates(subset='hID')

# Count the number of remaining rows
unique_hID_count = unique_df.shape[0]
print(unique_hID_count)

Output:

5

Amazing, right? This method achieves the same count of 5 as well! 🎉
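Where drop_duplicates() goes beyond a single-column count is distinct combinations of several columns. The sketch below, on the same sample data, counts the distinct (dID, mID) pairs by passing a list to subset. The names distinct_pairs and pair_count are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'hID': [101, 102, 103, 101, 102, 104, 105, 101],
    'dID': [10, 11, 12, 10, 11, 10, 12, 10],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

# Keep the first row for each distinct (dID, mID) combination
distinct_pairs = df.drop_duplicates(subset=['dID', 'mID'])

# Number of distinct combinations
pair_count = len(distinct_pairs)
print(pair_count)  # 6
```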

Solution 3: Utilizing a numpy array

If you prefer working with numpy, you can use numpy's unique() function, which returns the sorted distinct values of an array. It accepts a pandas column directly, so no explicit conversion is needed.

To count unique hID values using a numpy array, you can try this code:

import numpy as np

unique_hID_count = np.unique(df['hID']).size
print(unique_hID_count)

Output:

5

It's as simple as that! You get the same count of 5 using numpy too! 🎉
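np.unique() can also report how often each distinct value occurs if you pass return_counts=True. Here is a small sketch on the hID column; the explicit to_numpy() conversion and the per_value dictionary are just illustrative choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'hID': [101, 102, 103, 101, 102, 104, 105, 101]})

# values holds the sorted distinct hIDs, counts holds their frequencies
values, counts = np.unique(df['hID'].to_numpy(), return_counts=True)

# Pair each distinct value with its number of occurrences
per_value = dict(zip(values.tolist(), counts.tolist()))
print(per_value)  # {101: 3, 102: 2, 103: 1, 104: 1, 105: 1}
```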

Counting Total Values

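Now that we've covered counting unique values, let's talk about counting all the values in a column, duplicates included, similar to count(hID) in Qlik.

To count the non-missing hID entries, you can use the count() method. (df['hID'].value_counts().sum() gives the same number, since value_counts() also skips missing values by default, but it takes a detour through the per-value counts.)

hID_count = df['hID'].count()
print(hID_count)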
print(hID_count)

Output:

8

Voila! You get the desired count of 8, just like in Qlik! 🎉
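The sample data has no missing values, so every way of counting gives 8. The difference only shows up once a NaN appears. The sketch below uses a hypothetical four-row hID column with one missing entry (not part of the original data) to compare the row count with the non-missing count.

```python
import numpy as np
import pandas as pd

# Hypothetical hID column with one missing value
s = pd.Series([101, 102, np.nan, 101], name='hID')

total_rows = len(s)                        # counts every row, NaN included
non_null = s.count()                       # counts only non-missing values
via_value_counts = s.value_counts().sum()  # value_counts() drops NaN by default

print(total_rows)        # 4
print(non_null)          # 3
print(via_value_counts)  # 3
```

If you need the number of rows regardless of missing values, len(df) or df.shape[0] is the safer choice.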

Call to Action

You've made it to the end! 🙌 We hope this guide helped you understand how to count unique values in a pandas DataFrame, just like in Qlik. If you found it helpful, why not share it with your fellow developers and spread the knowledge? 🔗 Remember, sharing is caring!

Got any more pandas or general tech-related questions? Feel free to leave a comment below, and let's engage in a tech talk! 💬🔥
