How do I count the NaN values in a column in pandas DataFrame?
Counting NaN Values in a Pandas DataFrame using Python 🐍
So you have a large dataset with missing values (NaN) and you want to know how many NaN values are present in each column of your pandas DataFrame. Don't worry, we've got you covered! In this blog post, we will show you some easy solutions to tackle this problem.
The Problem: Counting NaN Values
Let's say you have a pandas DataFrame called df and you want to count the number of NaN values in each column. Here's an example to give you some context:
import numpy as np
import pandas as pd
# Small example DataFrame with one missing value (NaN) in each column
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 6, 7, 8],
    'C': [9, 10, np.nan, 12]
})
The DataFrame df looks like this:
     A    B     C
0  1.0  NaN   9.0
1  2.0  6.0  10.0
2  NaN  7.0   NaN
3  4.0  8.0  12.0
Now, let's dive into the solutions!
Solution 1: Using the .isnull() and .sum() methods
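The easiest way to count NaN values in each column is to combine the .isnull() and .sum() methods. .isnull() returns a DataFrame of booleans that is True wherever a value is missing, and .sum() adds them up column by column, counting each True as 1. Here's how you can do it: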
nan_counts = df.isnull().sum()
This will give you a pandas Series object with the column names as the index and the count of NaN values as the values. For our example DataFrame, nan_counts would look like this:
A    1
B    1
C    1
dtype: int64
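The title question asks about a single column, so here is a minimal sketch of how the same idea narrows down to one column or widens to the whole DataFrame. It reuses the example data from above; the variable names (mask, nan_in_a, total_nans) are only illustrative.

```python
import numpy as np
import pandas as pd

# Same example DataFrame as in the post.
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 6, 7, 8],
    'C': [9, 10, np.nan, 12],
})

# Step 1: boolean mask, True wherever a value is missing.
mask = df.isnull()

# Step 2: summing booleans counts the True entries in each column.
nan_counts = mask.sum()

# NaN count for a single column: select the column first.
nan_in_a = df['A'].isnull().sum()

# Total NaN count across the whole DataFrame: sum the per-column counts.
total_nans = int(nan_counts.sum())

print(nan_in_a)    # 1
print(total_nans)  # 3
```

Selecting the column before calling .isnull() avoids building a mask for columns you don't care about, which can matter on wide DataFrames.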
Solution 2: Using the .isna() and .sum() methods
Alternatively, you can use the .isna() method instead of .isnull() to achieve the same result. In pandas, .isnull() is an alias for .isna(), so the two are interchangeable:
nan_counts = df.isna().sum()
The output will be identical to Solution 1.
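If you'd like to confirm this on your own data, a quick sanity check along these lines should do. It is only a sketch using the example DataFrame from this post; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same example DataFrame as in the post.
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 6, 7, 8],
    'C': [9, 10, np.nan, 12],
})

counts_isna = df.isna().sum()
counts_isnull = df.isnull().sum()

# Series.equals compares both the index and the values.
same = counts_isna.equals(counts_isnull)
print(same)
```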
Solution 3: Using the .apply() method
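If you prefer a more versatile approach, you can use the .apply() method along with a lambda function to count the NaN values column-wise. .apply() calls the function once per column, which makes it easy to swap in custom logic, though it is usually slower than Solutions 1 and 2 on wide DataFrames. Here's an example: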
nan_counts = df.apply(lambda x: x.isnull().sum())
This will give you the same output as Solutions 1 and 2.
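Where .apply() tends to earn its keep is when you want more than a plain count. The sketch below returns both the NaN count and the NaN percentage for each column; missing_summary is a hypothetical helper name made up for this example, not part of pandas.

```python
import numpy as np
import pandas as pd

# Same example DataFrame as in the post.
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 6, 7, 8],
    'C': [9, 10, np.nan, 12],
})


def missing_summary(col):
    """Hypothetical helper: NaN count and NaN percentage for one column."""
    is_missing = col.isnull()
    return pd.Series({
        'n_missing': is_missing.sum(),
        # The mean of a boolean Series is the fraction of True values.
        'pct_missing': is_missing.mean() * 100,
    })


# One result column per original column, one row per statistic.
summary = df.apply(missing_summary)
print(summary)
```

Because the helper returns a Series, .apply() stacks the results into a small DataFrame that you can read at a glance.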
Choose Your Solution
Now that you know three different ways to count NaN values in a pandas DataFrame, it's up to you to choose the one that suits your needs best. For most cases, df.isna().sum() is the shortest option and runs as a vectorized operation; .isnull() behaves the same way, and .apply() is mainly worth reaching for when you need custom per-column logic.
Conclusion
Counting NaN values in a pandas DataFrame doesn't have to be complicated. With the help of the .isnull(), .isna(), .sum(), and .apply() methods, you can easily determine the number of missing values in each column.
We hope this guide has been helpful to you! If you have any questions or suggestions, feel free to let us know in the comments section below.
Happy coding! 💻