Pandas DataFrame Fillna Method

Introduction

In this lab, you will learn how to use the fillna() method in the Pandas library. The fillna() method allows you to fill missing or NaN (Not a Number) values in a DataFrame with specified values or using a specified method.

Importing the necessary libraries

Let's start by importing the Pandas library:

import pandas as pd

Create a DataFrame with missing values

Next, let's create a DataFrame with some missing values:

import numpy as np  # pandas has no pd.nan attribute; np.nan is the usual missing-value marker

df = pd.DataFrame([[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]], columns=list('ABC'))
print("The DataFrame is:")
print(df)
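
Before filling anything, it can help to see where the gaps are. The following is a minimal sketch that assumes the same DataFrame as above and uses np.nan from NumPy for the missing entries. The name missing_per_column is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]],
    columns=list("ABC"),
)

# isna() marks each missing cell as True; sum() then counts them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

For this DataFrame, columns 'A' and 'B' each have two missing values and column 'C' has one.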

Fill missing values with a specified value

We can use the fillna() method to replace all missing values with a specified value. For example, let's replace all missing values in the DataFrame with the value 2:

print("Filling NaN values:")
print(df.fillna(2))
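
Note that fillna() returns a new DataFrame by default rather than changing df. A short sketch of this behavior, assuming the same DataFrame as above (the variable name filled is only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]],
    columns=list("ABC"),
)

# Every NaN is replaced by 2; existing values are left as they are.
filled = df.fillna(2)
print(filled)

# The original DataFrame still contains its five NaN values.
print(df.isna().sum().sum())
```

To keep the filled result, assign it to a variable as shown here.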

Fill missing values using forward fill method

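Instead of using a single value to fill missing values, we can propagate non-null values forward using the forward fill method (ffill). This method fills each missing value with the last observed non-null value above it in the same column. A missing value with no earlier non-null value in its column stays NaN. Note that pandas 2.1 and later deprecate the method argument of fillna() in favor of the dedicated df.ffill() and df.bfill() methods, so the method='...' calls in this lab may emit a FutureWarning or stop working on newer releases.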

print("Filling NaN values using forward fill method:")
print(df.fillna(method='ffill'))
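
As a sketch of the same forward fill, the example below uses df.ffill(), the dedicated method that newer pandas versions recommend over passing method='ffill' to fillna(). It assumes the same DataFrame as above, and the name forward is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]],
    columns=list("ABC"),
)

# Each NaN takes the last non-null value above it in the same column.
forward = df.ffill()
print(forward)

# Column 'B' starts with NaN, so its first two rows have nothing to copy from.
print(forward["B"].isna().tolist())
```

Column 'A' becomes 2, 2, 2 and column 'C' becomes 0, 5, 5, while the two leading NaN values in column 'B' remain.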

Fill missing values using forward fill method along the columns

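We can also forward fill across each row by setting the axis parameter to 1. Each missing value then takes the last non-null value to its left in the same row, instead of the value above it in the same column.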

print("Filling NaN values using forward fill method along the columns:")
print(df.fillna(method='ffill', axis=1))
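
To see what axis=1 changes, here is a minimal sketch using df.ffill(axis=1) on the same DataFrame. The name across is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]],
    columns=list("ABC"),
)

# With axis=1, values move left to right within each row.
across = df.ffill(axis=1)
print(across)
```

The first row becomes 2, 2, 0 and the last row becomes NaN, 3, 3. The second row keeps its two leading NaN values because there is no value to their left.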

Fill missing values using backward fill method

Similarly, we can use the backward fill method (bfill) to propagate non-null values backward. This method fills missing values with the next observed non-null value in the same column.

print("Filling NaN values using backward fill method:")
print(df.fillna(method='bfill'))
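
The backward fill can be sketched the same way with df.bfill(), assuming the same DataFrame as above. The name backward is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]],
    columns=list("ABC"),
)

# Each NaN takes the next non-null value below it in the same column.
backward = df.bfill()
print(backward)
```

Column 'B' becomes 3, 3, 3. The trailing NaN values in columns 'A' and 'C' remain because no later value exists to copy from.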

Replace specific columns' missing values with specified values

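We can replace specific columns' missing values with different specified values by passing a dictionary that maps column names to fill values. For example, let's replace missing values in columns 'A', 'B', and 'C' with values 0, 1, and 2, respectively. The dictionary below also contains a key 'D'; because the DataFrame has no column 'D', that entry is ignored.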

print("Filling NaN values in specific columns:")
new_values = {'A': 0, 'B': 1, 'C': 2, 'D': 3}
print(df.fillna(value=new_values))
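
The following sketch checks the per-column result, assuming the same DataFrame as above. The name filled is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]],
    columns=list("ABC"),
)

# Keys are column names; each column's NaN values get that column's fill value.
new_values = {"A": 0, "B": 1, "C": 2, "D": 3}
filled = df.fillna(value=new_values)
print(filled)

# No column 'D' is created; the extra key has no effect.
print(list(filled.columns))
```

Column 'A' becomes 2, 0, 0, column 'B' becomes 1, 1, 3, and column 'C' becomes 0, 5, 2.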

Limit the number of consecutive NaN values to fill

We can also limit the number of consecutive NaN values to fill using the limit parameter. For example, let's limit the forward filling to only one consecutive NaN value.

print("Filling NaN values with a limit:")
print(df.fillna(method='ffill', limit=1))
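
To make the effect of limit visible, this sketch compares an unlimited forward fill with a limited one on column 'A', which has two consecutive NaN values after its first entry. It uses df.ffill() on the same DataFrame as above; the names unlimited and limited are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[2, np.nan, 0], [np.nan, np.nan, 5], [np.nan, 3, np.nan]],
    columns=list("ABC"),
)

unlimited = df.ffill()
limited = df.ffill(limit=1)

# Without a limit, both NaN values in column 'A' are filled with 2.
print(unlimited["A"].tolist())

# With limit=1, only the first NaN after a valid value is filled.
print(limited["A"].tolist())
```

With limit=1, the last row of column 'A' stays NaN because it is the second consecutive missing value.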

Summary

In this lab, you learned how to use the fillna() method in the Pandas library to fill missing (NaN) values in a DataFrame. You filled missing values with a single specified value, with forward and backward fill, across rows by setting axis to 1, with per-column values from a dictionary, and with a limit on the number of consecutive NaN values to fill. Together, these options give you several ways to handle missing data before further analysis.
