pandas get rows which are NOT in other dataframe

Matheus Mello
published a few days ago. updated a few hours ago

Getting Rows in Pandas DataFrame that are NOT in Another DataFrame 😎

So, you have two pandas data frames - df1 and df2 - and you want to find the rows in df1 that are not present in df2. Well, you've come to the right place! In this blog post, we'll explore common issues and provide you with easy solutions to this problem. Let's dive in! 🚀

The Problem and Example Data

To understand the problem better, let's take a look at an example. We have two data frames, df1 and df2, where df2 is a subset of df1. Our goal is to extract the rows from df1 that are not present in df2.

import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({'col1': [1, 2, 3], 'col2': [10, 11, 12]})

print("df1:")
print(df1)

print("\ndf2:")
print(df2)

Here's the output:

df1:
   col1  col2
0     1    10
1     2    11
2     3    12
3     4    13
4     5    14

df2:
   col1  col2
0     1    10
1     2    11
2     3    12

The Solution: Using Pandas' isin() and Boolean Masking

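To get the rows in df1 that are not present in df2, we can combine pandas' DataFrame.isin() method with Boolean masking. One thing to keep in mind: when isin() is given a DataFrame, it matches values by index and column labels, so this approach works here because the rows of df2 carry the same index labels as the matching rows of df1. Here's how you can do it: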

mask = ~df1.isin(df2).all(axis=1)
result = df1[mask]

print("\nExpected result:")
print(result)

And the expected result will be:

   col1  col2
3     4    13
4     5    14

Let's break down the solution:

  1. df1.isin(df2) compares df1 with df2 cell by cell, matching on both index and column labels. The result is a boolean DataFrame in which a cell is True only if df2 holds the same value at the same index label and column. Rows of df1 whose index label does not exist in df2 come out as all False.

  2. all(axis=1) checks if all values in a row are True. This operation returns a boolean Series with True for rows whose every cell matched df2 and False for rows that did not.

  3. ~ is the negation operator, flipping True to False and vice versa. We use it to obtain a boolean mask that is True for rows not present in df2 and False for rows that are present.

  4. Finally, we apply this boolean mask to df1 using df1[mask], which gives us the desired result - the rows that are not present in df2.
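
To make the four steps above easier to follow, here is a minimal sketch that stores each intermediate value under its own name. The names cell_matches and row_matches are ours, introduced only for illustration; the data is the same df1 and df2 from the example.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [10, 11, 12, 13, 14]})
df2 = pd.DataFrame({'col1': [1, 2, 3], 'col2': [10, 11, 12]})

# Step 1: cell-by-cell comparison, matched on index and column labels
cell_matches = df1.isin(df2)

# Step 2: True only where every cell in the row matched
row_matches = cell_matches.all(axis=1)

# Step 3: flip the booleans so True marks rows that did NOT match
mask = ~row_matches

# Step 4: keep only the rows where the mask is True
result = df1[mask]

print(cell_matches)
print(row_matches.tolist())  # [True, True, True, False, False]
print(result)
```

Printing cell_matches is a quick way to check whether the index labels of your two frames line up the way you expect before trusting the final result.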

Conclusion and Your Turn! 😉

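That's it! You've learned a compact way to extract the rows of a pandas DataFrame that are not present in another DataFrame. Keep in mind that it relies on the two frames sharing index labels for their matching rows, as they do in our example. When that holds, feel free to use this technique in your projects!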
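
If your second frame has its rows in a different order, or a different index, the isin() mask can report rows as missing even though they exist. A common alternative is a left merge with indicator=True, which compares rows by their column values instead of by index. The sketch below is one way to do it, not the only one; df2_reordered, isin_result, merged and anti_join are names we made up for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [10, 11, 12, 13, 14]})

# Same rows as df2 in the example, but in reverse order with a fresh 0..2 index
df2_reordered = pd.DataFrame({'col1': [3, 2, 1], 'col2': [12, 11, 10]})

# The index-aligned isin() approach now wrongly keeps rows 0 and 2
isin_result = df1[~df1.isin(df2_reordered).all(axis=1)]

# Left merge on all shared columns; the _merge column records
# whether each row of df1 was also found in df2_reordered
merged = df1.merge(df2_reordered, how='left', indicator=True)

# Keep the rows found only in df1, and drop the helper column
anti_join = merged.loc[merged['_merge'] == 'left_only', df1.columns]

print(isin_result['col1'].tolist())  # [1, 3, 4, 5]
print(anti_join['col1'].tolist())    # [4, 5]
```

Note that the merged frame gets a new integer index, so the original row labels of df1 are not preserved in general. Here they happen to coincide because df1 already uses the default 0..4 index.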

Now, it's time for you to put your newfound knowledge into action. Try applying this solution to your own data frames and see how it works. Share your experience and any other useful tips in the comments section below! Let's keep learning together! 🌟

