Counting unique values in a column in pandas dataframe like in Qlik?

Matheus Mello
published a few days ago. updated a few hours ago

Counting Unique Values in a Pandas DataFrame like in Qlik

Hey there tech enthusiasts! 👋 Are you struggling to count unique values in a column of a pandas DataFrame, just like in Qlik? Look no further! In this blog post, we will explore a common issue that many developers face and provide easy solutions using Python's pandas library. So, let's dive right in! 💪

The Scenario

Imagine you have a pandas DataFrame like this:

import pandas as pd

df = pd.DataFrame({
    'hID': [101, 102, 103, 101, 102, 104, 105, 101],
    'dID': [10, 11, 12, 10, 11, 10, 12, 10],
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

Qlik users are familiar with the count(distinct hID) expression, which returns 5 here because hID holds five distinct values (101 through 105). But how can you get the same result from a pandas DataFrame or a numpy array? Let's find out! 💡

Counting Unique Values in pandas

Solution 1: Using the nunique() Function

The simplest way to count unique values in a pandas DataFrame column is the nunique() method, which returns the number of distinct elements in the column. By default it ignores missing values (NaN); pass dropna=False if you want them counted as a value too.

To count unique hID values in the DataFrame df, you can simply do:

unique_hID_count = df['hID'].nunique()
print(unique_hID_count)

Output:

5

Isn't that super easy? You get the same count of 5, just like in Qlik! 🎉
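Want distinct counts for more than one column, or per group? Here is a minimal sketch, assuming the sample DataFrame from The Scenario (rebuilt here so the snippet runs on its own). The names per_column and per_group, and the choice to group by dID, are illustrative and not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'hID': [101, 102, 103, 101, 102, 104, 105, 101],
    'dID': [10, 11, 12, 10, 11, 10, 12, 10],
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

# Distinct count for every column at once (a Series indexed by column name)
per_column = df.nunique()
print(per_column)

# Distinct hID values within each dID group
per_group = df.groupby('dID')['hID'].nunique()
print(per_group)
```

Calling nunique() on the whole DataFrame saves a loop over columns, and the groupby() form is roughly what you would reach for when the distinct count sits next to a dimension.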

Solution 2: Using the drop_duplicates() Function

Another approach to counting unique values is the drop_duplicates() method. With subset='hID', it returns a new DataFrame that keeps only the first row for each distinct hID, so counting the remaining rows gives you the distinct count.

To count unique hID values using this method, follow these steps:

# Drop duplicate rows based on 'hID'
unique_df = df.drop_duplicates(subset='hID')

# Count the number of remaining rows
unique_hID_count = unique_df.shape[0]
print(unique_hID_count)

Output:

5

Amazing, right? This method achieves the same count of 5 as well! 🎉
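One thing worth knowing: drop_duplicates() hands back whole rows, not just a number, and the keep parameter decides which row survives for each hID. A small sketch, again assuming the sample DataFrame from The Scenario; first_rows and last_rows are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'hID': [101, 102, 103, 101, 102, 104, 105, 101],
    'dID': [10, 11, 12, 10, 11, 10, 12, 10],
    'uID': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
    'mID': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C']
})

# keep='first' is the default: the earliest row for each hID is retained
first_rows = df.drop_duplicates(subset='hID')

# keep='last' retains the latest row for each hID instead
last_rows = df.drop_duplicates(subset='hID', keep='last')

print(first_rows)
print(last_rows)

# Either way, the number of rows is the distinct hID count
print(len(first_rows), len(last_rows))
```

Both results have the same number of rows, so the count is unaffected. The difference only matters if you also want to use the other columns of the surviving rows.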

Solution 3: Utilizing a numpy array

If you prefer working with numpy, you can pass the pandas column to numpy's unique() function, which converts it to an array and returns the sorted distinct values. The size of that result is the distinct count.

To count unique hID values using a numpy array, you can try this code:

import numpy as np

unique_hID_count = np.unique(df['hID']).size
print(unique_hID_count)

Output:

5

It's as simple as that! You get the same count of 5 using numpy too! 🎉
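np.unique() can also tell you how often each value occurs, and it treats missing values differently from nunique(). The sketch below is a hedged illustration: the Series with_gap is made-up data containing a single NaN, not something from the original DataFrame.

```python
import numpy as np
import pandas as pd

hid = pd.Series([101, 102, 103, 101, 102, 104, 105, 101])

# return_counts=True also returns the number of occurrences of each value
values, counts = np.unique(hid.to_numpy(), return_counts=True)
print(values)
print(counts)

# Hypothetical column with one missing value
with_gap = pd.Series([101.0, 102.0, np.nan, 101.0])

# pandas skips NaN by default; numpy counts it as a distinct value
pandas_distinct = with_gap.nunique()
numpy_distinct = np.unique(with_gap.to_numpy()).size
print(pandas_distinct, numpy_distinct)
```

If your column can contain missing values, the three solutions may not agree, so it is worth deciding up front whether NaN should count as a value.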

Counting Total Values

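Now that we've covered counting unique values, let's talk about counting all the values in a column, duplicates included, similar to count(hID) in Qlik.

One way to count every hID entry is value_counts(), which tallies how often each distinct value occurs; summing those tallies gives the total. (df['hID'].count() returns the same number directly.)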

hID_count = df['hID'].value_counts().sum()
print(hID_count)

Output:

8

Voila! You get the desired count of 8, just like in Qlik! 🎉
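Total counts can differ depending on whether missing values are included. A minimal sketch, assuming a made-up Series with one NaN (not the original data), comparing the options:

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
hid = pd.Series([101, 102, np.nan, 101])

# count() and value_counts().sum() both skip NaN
non_null = hid.count()
via_value_counts = hid.value_counts().sum()

# len() and .size count every row, NaN included
all_rows = len(hid)

print(non_null, via_value_counts, all_rows)
```

On the sample DataFrame from The Scenario there are no missing values, so all of these would give 8.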

Call to Action

You've made it to the end! 🙌 We hope this guide helped you understand how to count unique values in a pandas DataFrame, just like in Qlik. If you found it helpful, why not share it with your fellow developers and spread the knowledge? 🔗 Remember, sharing is caring!

Got any more pandas or general tech-related questions? Feel free to leave a comment below, and let's engage in a tech talk! 💬🔥

