Pandas DataFrame Filter Method


Introduction

In this lab, we will learn how to use the filter() method of the Pandas DataFrame. The filter() method subsets the rows or columns of a DataFrame according to their labels, that is, the row index labels or the column names. Note that it matches only on labels and never on the values stored in the DataFrame.
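To make the difference between labels and content concrete, here is a minimal sketch. The `scores` DataFrame and its values are hypothetical and are not part of this lab's data. It compares a label-based filter() call with a boolean mask, which is the usual way to select on cell values.

```python
import pandas as pd

# Hypothetical data, used only for this illustration
scores = pd.DataFrame(
    {"math": [90, 55], "art": [70, 95]},
    index=["alice", "bob"],
)

# filter() looks at labels: keep columns whose name contains "ma"
by_label = scores.filter(like="ma")

# Selecting on cell values needs a boolean mask instead of filter()
by_value = scores[scores["math"] > 60]

print(by_label)
print(by_value)
```

Here `by_label` keeps the math column for every row, while `by_value` keeps only the rows whose math score exceeds 60.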



Create a DataFrame

First, let's create a sample DataFrame to work with.

import pandas as pd

# Create a sample DataFrame; the row index deliberately repeats
# the labels Group_1 and Group_2
df = pd.DataFrame(
    {
        "Name": ["Navya", "Vindya", "Sinchana", "Amrutha", "Akshatha"],
        "Age": [25, 24, 25, 25, 26],
        "Education": ["M.Tech", "M.Tech", "M.Tech", "Ph.d", "Ph.d"],
        "YOP": [2019, 2020, 2018, None, None],
    },
    index=["Group_1", "Group_1", "Group_1", "Group_2", "Group_2"],
)

# Print the DataFrame
print("-------DataFrame is----------")
print(df)

Filter by column names using the filter() method

The items parameter of the filter() method takes a list of labels to keep. For a DataFrame, filter() works on the columns by default, so the call below keeps only the Name and Education columns.

#filter by column names
filtered_df = df.filter(items=["Name","Education"])

#printing filtered DataFrame
print("---------Filtered DataFrame---------")
print(filtered_df)
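One property of items worth knowing is that labels not present in the DataFrame are skipped rather than raising an error. The sketch below illustrates this on a small stand-in DataFrame named `people`. "Salary" is a hypothetical label chosen because it does not exist in the data.

```python
import pandas as pd

# Small stand-in for the lab's DataFrame
people = pd.DataFrame(
    {
        "Name": ["Navya", "Vindya"],
        "Age": [25, 24],
        "Education": ["M.Tech", "M.Tech"],
    }
)

# "Salary" is a hypothetical label that is not a column of `people`;
# filter() keeps the labels it finds and skips the rest
subset = people.filter(items=["Name", "Age", "Salary"])
print(subset.columns.tolist())

# Plain bracket indexing with the same list raises KeyError instead
try:
    people[["Name", "Age", "Salary"]]
    raised = False
except KeyError:
    raised = True
print(raised)
```

This makes filter(items=...) convenient when a column may or may not be present, at the cost of hiding typos in label names.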

Filter by row names using the filter() method

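The like parameter of the filter() method keeps the labels that contain a given substring. Passing axis=0 applies the match to the row index instead of the columns, so the call below keeps the rows labeled Group_2.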

#filter by row names
filtered_df = df.filter(like='Group_2', axis=0)

#printing filtered DataFrame
print("---------Filtered DataFrame---------")
print(filtered_df)
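Because like is a substring test on labels, a partial label is enough, and a string that appears only in the cell values matches nothing. A short sketch on a reduced copy of the lab's data (three rows, two columns) shows both cases:

```python
import pandas as pd

# Reduced copy of the lab's DataFrame
df = pd.DataFrame(
    {
        "Name": ["Navya", "Vindya", "Amrutha"],
        "Education": ["M.Tech", "M.Tech", "Ph.d"],
    },
    index=["Group_1", "Group_1", "Group_2"],
)

# "_1" is a substring of the row label "Group_1"
ones = df.filter(like="_1", axis=0)

# "Group" is a substring of every row label, so all rows are kept
all_groups = df.filter(like="Group", axis=0)

# "Ph.d" occurs only as a cell value, not as a row label,
# so the result has no rows
no_match = df.filter(like="Ph.d", axis=0)

print(ones)
print(all_groups)
print(no_match)
```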

Filter by column names with the regex parameter

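The regex parameter of the filter() method keeps the labels that match a regular expression. The pattern is applied with re.search, so it may match anywhere in the label. The pattern '[g]' below matches any column name containing a lowercase g, which in this DataFrame is only Age.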

# Filter by column names with regex
filtered_df = df.filter(regex='[g]')

#printing filtered DataFrame
print("---------Filtered DataFrame---------")
print(filtered_df)
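Since the pattern is searched for anywhere in the label, anchors such as ^ and $ are needed to pin a match to the start or end. The following sketch uses a shortened copy of the lab's DataFrame with illustrative patterns to show anchored matches on columns and a regex applied to row labels with axis=0.

```python
import pandas as pd

# Shortened copy of the lab's DataFrame (same column names)
df = pd.DataFrame(
    {
        "Name": ["Navya", "Amrutha"],
        "Age": [25, 25],
        "Education": ["M.Tech", "Ph.d"],
        "YOP": [2019, None],
    },
    index=["Group_1", "Group_2"],
)

# Column names that start with "A" or "E"
starts_with = df.filter(regex="^[AE]")

# Column names that end with a lowercase "e"
ends_with = df.filter(regex="e$")

# Row labels that end with "2" (axis=0 targets the row index)
rows = df.filter(regex="2$", axis=0)

print(starts_with.columns.tolist())
print(ends_with.columns.tolist())
print(rows.index.tolist())
```

Matching is case sensitive, which is why '[g]' in the step above selects Age but would not match a label containing only an uppercase G.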

Summary

In this lab, we learned how to use the filter() method in the Pandas DataFrame. We covered how to filter the DataFrame by column names and row names, as well as how to filter based on a regular expression. This method is useful for subsetting the DataFrame based on specific index labels.
