How to filter Pandas dataframe using "in" and "not in" like in SQL

Cover Image for How to filter Pandas dataframe using "in" and "not in" like in SQL
Matheus Mello
Matheus Mello
published a few days ago. updated a few hours ago

Filtering Pandas DataFrame Using 'in' and 'not in' like in SQL: A Quick Guide 👩‍💻

So, you're working with a Pandas DataFrame and wondering how to filter it using the 'in' and 'not in' operators, just like you would in SQL. Look no further! In this guide, we'll address this common issue and provide you with easy solutions. 🚀

The Scenario 📚

Before we dive into the solutions, let's set the stage with a scenario. Assume you have a DataFrame called df with a column named 'country'. You also have a list called countries_to_keep that contains the countries you want to retain in your filtered DataFrame.

Let's take a look at the sample code provided in the context:

df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']})
countries_to_keep = ['UK', 'China']

# pseudo-code:
df[df['country'] not in countries_to_keep]

The Problem 🤔

The pseudo-code provided above might seem like the intuitive way to filter the DataFrame using 'not in' in Python. However, it's not correct syntax for filtering based on multiple values. You'll likely encounter an error if you try to execute the code.

Solution 1: Using the isin() Method ✅

Pandas provides an elegant solution for filtering DataFrames using the 'in' operator. You can achieve it by using the isin() method. Here's how you can apply it to your specific scenario:

df[df['country'].isin(countries_to_keep)]

By using the isin() method in the above code snippet, you can filter the DataFrame based on whether the 'country' values are present in the 'countries_to_keep' list. It will return a filtered DataFrame with only the rows where the 'country' falls within the desired countries list.

Solution 2: Using ~ Operator for 'not in' Filtering 🚫

For filtering based on the 'not in' operator, Pandas allows us to use the tilde (~) operator. Here's how you can implement it:

df[~df['country'].isin(countries_to_keep)]

In the code snippet above, the ~ operator negates the result of df['country'].isin(countries_to_keep). This will return a filtered DataFrame with rows where the 'country' values are not present in the 'countries_to_keep' list.

A Better Way? 😄

In the quoted context, an alternative approach was suggested using DataFrame merging. While this approach might work, it can sometimes feel a bit convoluted and not as straightforward as the previous solutions we discussed.

Conclusion and Call to Action 🎉

You've learned two simple and effective ways to filter a Pandas DataFrame using 'in' and 'not in' operators, just like in SQL. Now you can slice and dice your data based on the values you want to include or exclude.

Try these solutions for yourself and let us know which approach works best in your case. If you have any other tips or tricks to share, or if you encountered any issues while implementing these solutions, drop us a comment below. We'd love to hear from you! 👇

Happy coding! 💻✨


More Stories

Cover Image for How can I echo a newline in a batch file?

How can I echo a newline in a batch file?

updated a few hours ago
batch-filenewlinewindows

🔥 💻 🆒 Title: "Getting a Fresh Start: How to Echo a Newline in a Batch File" Introduction: Hey there, tech enthusiasts! Have you ever found yourself in a sticky situation with your batch file output? We've got your back! In this exciting blog post, we

Matheus Mello
Matheus Mello
Cover Image for How do I run Redis on Windows?

How do I run Redis on Windows?

updated a few hours ago
rediswindows

# Running Redis on Windows: Easy Solutions for Redis Enthusiasts! 🚀 Redis is a powerful and popular in-memory data structure store that offers blazing-fast performance and versatility. However, if you're a Windows user, you might have stumbled upon the c

Matheus Mello
Matheus Mello
Cover Image for Best way to strip punctuation from a string

Best way to strip punctuation from a string

updated a few hours ago
punctuationpythonstring

# The Art of Stripping Punctuation: Simplifying Your Strings 💥✂️ Are you tired of dealing with pesky punctuation marks that cause chaos in your strings? Have no fear, for we have a solution that will strip those buggers away and leave your texts clean an

Matheus Mello
Matheus Mello
Cover Image for Purge or recreate a Ruby on Rails database

Purge or recreate a Ruby on Rails database

updated a few hours ago
rakeruby-on-railsruby-on-rails-3

# Purge or Recreate a Ruby on Rails Database: A Simple Guide 🚀 So, you have a Ruby on Rails database that's full of data, and you're now considering deleting everything and starting from scratch. Should you purge the database or recreate it? 🤔 Well, my

Matheus Mello
Matheus Mello