Pandas DataFrame Median Method


Introduction

In this lab, we will learn how to use the median() method in the Pandas library to calculate the median of values in a DataFrame. The median is the middle value of a sorted set of values (or the mean of the two middle values when the count is even), which makes it a measure of central tendency.

Import the pandas library

First, we need to import the 'pandas' library, which is commonly used for data manipulation and analysis.

import pandas as pd

Create a DataFrame

Next, we will create a DataFrame object using the pd.DataFrame() constructor. This will allow us to store and manipulate our data.

df = pd.DataFrame({"A": [0, 52, 78], "B": [77, 45, 96], "C": [16, 23, 135], "D": [17, 22, 56]})
print("------The DataFrame is------")
print(df)

Calculate the median along the index axis

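To calculate the median of each column, we can use the median() method with the axis parameter set to 0, which is also the default. The computation runs down the index (the rows), so the result is a Series with one median per column.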

print("---------------------------")
print(df.median(axis=0))
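
As a quick check, the sketch below recomputes one column median by hand. It assumes the same DataFrame as above; the names col_medians and manual are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 52, 78], "B": [77, 45, 96], "C": [16, 23, 135], "D": [17, 22, 56]})

# axis=0 reduces down the rows, leaving one median per column
col_medians = df.median(axis=0)
print(col_medians)

# Manual check for column "B": sort the three values and take the middle one
manual = sorted(df["B"])[len(df) // 2]
print(manual)  # 77
```

Each column here has an odd number of values, so the median is simply the middle value after sorting.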

Calculate the median along the column axis

To calculate the median of each row, we can use the median() method with the axis parameter set to 1. The computation runs across the columns, so the result is a Series with one median per row.

print("---------------------------")
print(df.median(axis=1))
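
Each row of this DataFrame has four values, an even count, so the median is the mean of the two middle values. The sketch below checks the first row by hand, assuming the same DataFrame as above; row_medians, first_row, and manual are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 52, 78], "B": [77, 45, 96], "C": [16, 23, 135], "D": [17, 22, 56]})

# axis=1 reduces across the columns, leaving one median per row
row_medians = df.median(axis=1)
print(row_medians)

# Row 0 holds 0, 77, 16, 17; sorted that is 0, 16, 17, 77
first_row = sorted(df.loc[0])
# With an even count, average the two middle values: (16 + 17) / 2
manual = (first_row[1] + first_row[2]) / 2
print(manual)  # 16.5
```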

Handling null values

If our DataFrame contains null values, we can control how they are treated with the skipna parameter. By default, skipna is set to True, which excludes null values when computing the median. If we set skipna to False, null values are not skipped, and the median of any column (or row) that contains a null value is NaN.

df = pd.DataFrame({"A": [0, None, 78], "B": [77, 45, None], "C": [16, 23, None], "D": [17, 22, 56]})
print("------The DataFrame is------")
print(df)
print("---------------------------")
print(df.median(axis=0, skipna=False))
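
To see the effect of skipna side by side, here is a small sketch that assumes the same DataFrame with null values. The names skipped, kept, and comparison are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, None, 78], "B": [77, 45, None], "C": [16, 23, None], "D": [17, 22, 56]})

# Default behaviour: null values are ignored in each column
skipped = df.median(axis=0, skipna=True)

# Null values are not ignored: any column containing one yields NaN
kept = df.median(axis=0, skipna=False)

# Put both results in one table for comparison
comparison = pd.DataFrame({"skipna=True": skipped, "skipna=False": kept})
print(comparison)
```

Only column D has no null values, so it is the only column with a numeric median when skipna is False.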

Summary

In this lab, we learned how to use the median() method in Pandas to calculate the median of values in a DataFrame. We explored how to calculate the median of each column (axis=0) and of each row (axis=1). Additionally, we learned how the skipna parameter controls whether null values are ignored when computing the median. The median is a useful summary of the central tendency of a dataset.
