Pandas DataFrame Isnull Method


Introduction

In this lab, we will learn how to use the DataFrame.isnull() method in pandas to detect missing values in a DataFrame. The method returns a DataFrame of the same shape filled with boolean values: True where the element is null (for example np.nan or None) and False where it is not. Note that, by default, empty strings and numpy.inf are not treated as null values.
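To make the distinction concrete, here is a minimal sketch of which values the method flags. The DataFrame name `samples` and its contents are made up for illustration and are not part of the lab's data.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column DataFrame mixing real missing values with look-alikes
samples = pd.DataFrame({"value": [np.nan, None, "", np.inf, 0]})

# With default settings, only np.nan and None are flagged as null;
# the empty string, infinity and zero all count as present values
mask = samples.isnull()
print(mask)
```

Only the first two rows should come back as True.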

Create a DataFrame

Let's start by creating a DataFrame with some missing values. We will use the pd.DataFrame constructor from pandas and the np.nan constant from numpy to mark the missing entries.

## Importing pandas as pd
import pandas as pd
## Importing numpy as np
import numpy as np

## Creating the DataFrame
df = pd.DataFrame([(0.0, np.nan, -1.0, 1.0), (np.nan, 2.0, np.nan, np.nan), (2.0, 3.0, np.nan, 9.0)], columns=list('abcd'))

This will create a DataFrame with four columns ('a', 'b', 'c', 'd') and three rows. The DataFrame contains missing values represented by np.nan.

Detect Missing Values

Now, we will use the DataFrame.isnull() method to detect the missing values in the DataFrame.

## Detecting missing values in the DataFrame
missing_values = df.isnull()

## Printing the boolean mask of missing values
print(missing_values)

This returns a DataFrame of boolean values with the same shape, index, and column labels as the original. True indicates that the corresponding element is missing (here, np.nan), and False indicates that a value is present. The original DataFrame is not modified.
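Because True counts as 1 when summed, the boolean mask can also be turned into a count of missing values. The sketch below rebuilds the same DataFrame so that it runs on its own; the names `mask`, `per_column`, and `total` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Same data as the DataFrame created earlier in this lab
df = pd.DataFrame(
    [(0.0, np.nan, -1.0, 1.0), (np.nan, 2.0, np.nan, np.nan), (2.0, 3.0, np.nan, 9.0)],
    columns=list("abcd"),
)

mask = df.isnull()

# Summing a boolean column counts its True entries
per_column = mask.sum()          # a: 1, b: 1, c: 2, d: 1
total = int(per_column.sum())    # 5 missing values in total

print(per_column)
print(total)
```

Per-column counts like these are a quick way to decide which columns need cleaning first.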

Consider Empty Strings as Missing Values

By default, the DataFrame.isnull() method does not treat empty strings as missing values. If you want them to count as missing, replace them with np.nan before calling the method.

## Replacing empty strings with np.nan
df = df.replace('', np.nan)

## Detecting missing values in the DataFrame
missing_values = df.isnull()

## Printing the boolean mask of missing values
print(missing_values)

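After the replacement, any cell that held an empty string is reported as True. The DataFrame created earlier in this lab contains only numbers and np.nan, so on that data the replacement changes nothing and the output is the same as before.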
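Since the lab's DataFrame has no empty strings, the effect is easier to see on text data. The following is a small sketch using a hypothetical DataFrame named `people`; its columns and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical text data containing two empty strings and one real NaN
people = pd.DataFrame({
    "name": ["Ann", "", "Cy"],
    "city": ["", "Oslo", np.nan],
})

# Before the replacement, only the real NaN is flagged
before = people.isnull()

# replace() returns a new DataFrame; empty strings become np.nan
cleaned = people.replace("", np.nan)
after = cleaned.isnull()

print(before)
print(after)
```

Only the np.nan in the city column is flagged in the first mask, while the second mask also flags the two cells that held empty strings.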

Summary

In this lab, we learned how to use the DataFrame.isnull() method in pandas to detect missing values in a DataFrame. It returns a DataFrame of boolean values, where True marks a missing value and False marks a present one. By default, empty strings are not considered missing, but they can be treated as missing by replacing them with np.nan before calling the method. This makes isnull() a useful first step in data cleaning and preprocessing for data analysis and machine learning projects.
