Pandas DataFrame Pivot Table Method

Introduction

In this lab, we will learn about the pivot_table() method in the Python pandas library. The pivot_table() method is used to aggregate and summarize data in a DataFrame. It returns a spreadsheet-style pivot table as a new DataFrame.

Import the required libraries and create the DataFrame

First, let's import the pandas library and create a DataFrame with some sample data. We will create a DataFrame with columns 'Date', 'State', 'Temperature', and 'Humidity'.

import pandas as pd

df = pd.DataFrame({'Date': ['1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021', '1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021'],
                   'State': ['Karnataka', 'Karnataka', 'Karnataka', 'Karnataka', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu'],
                   'Temperature': [25, 29, 28, 31, 26, 27, 22, 32],
                   'Humidity': [46, 50, 52, 59, 42, 45, 46, 43]})
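
Before pivoting, it can help to check how many rows fall into each 'Date'/'State' combination. The sketch below rebuilds the same sample data and counts the rows per pair with groupby(), a standard pandas method that this lab does not otherwise use. The variable name counts is only illustrative.

```python
import pandas as pd

# Same sample data as in the step above
df = pd.DataFrame({'Date': ['1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021', '1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021'],
                   'State': ['Karnataka', 'Karnataka', 'Karnataka', 'Karnataka', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu'],
                   'Temperature': [25, 29, 28, 31, 26, 27, 22, 32],
                   'Humidity': [46, 50, 52, 59, 42, 45, 46, 43]})

# Count how many rows share each Date/State pair
counts = df.groupby(['Date', 'State']).size()
print(counts)
```

Each pair appears twice, so putting 'Date' on the rows and 'State' on the columns means two readings must be combined into each cell. That combining step is the job of the aggregation function.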

Aggregate the DataFrame using the pivot_table() method

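To aggregate the data with the pivot_table() method, we specify which column supplies the row labels (index) and which supplies the column headers (columns). The values parameter selects the columns to aggregate. Because the call below omits it, pandas aggregates every remaining column ('Temperature' and 'Humidity'). The aggfunc parameter sets the aggregation function; 'mean' is also the default.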

pivot_df = df.pivot_table(index='Date', columns='State', aggfunc='mean')
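
To restrict the aggregation to a single column, the values parameter can be passed explicitly. The following is a minimal sketch using the same sample data; the variable name temp_mean is an illustrative choice and not part of the lab.

```python
import pandas as pd

# Same sample data as in the earlier step
df = pd.DataFrame({'Date': ['1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021', '1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021'],
                   'State': ['Karnataka', 'Karnataka', 'Karnataka', 'Karnataka', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu'],
                   'Temperature': [25, 29, 28, 31, 26, 27, 22, 32],
                   'Humidity': [46, 50, 52, 59, 42, 45, 46, 43]})

# Aggregate only 'Temperature': one row per Date, one column per State
temp_mean = df.pivot_table(index='Date', columns='State',
                           values='Temperature', aggfunc='mean')
print(temp_mean)
```

With values given as a single column name, the result has plain 'State' columns rather than a two-level header. For example, the cell for '1/1/2021' and 'Karnataka' is the mean of 25 and 29, which is 27.0.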

Display the resulting DataFrame

Finally, let's display the resulting pivot table DataFrame.

print(pivot_df)
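
Because the pivot call above aggregates both 'Temperature' and 'Humidity', the printed table has two-level column headers: the aggregated column name on top and the 'State' value below it. The sketch below rebuilds the same pivot table and shows one way to read parts of it. The names temp_only and tn_humidity are illustrative only.

```python
import pandas as pd

# Same sample data and pivot call as in the lab steps
df = pd.DataFrame({'Date': ['1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021', '1/1/2021', '1/1/2021', '2/1/2021', '2/1/2021'],
                   'State': ['Karnataka', 'Karnataka', 'Karnataka', 'Karnataka', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu', 'Tamilnadu'],
                   'Temperature': [25, 29, 28, 31, 26, 27, 22, 32],
                   'Humidity': [46, 50, 52, 59, 42, 45, 46, 43]})
pivot_df = df.pivot_table(index='Date', columns='State', aggfunc='mean')

# Selecting the top-level label returns a sub-table with one column per State
temp_only = pivot_df['Temperature']
print(temp_only)

# A (value, State) tuple addresses a single column; .loc adds the row label
tn_humidity = pivot_df.loc['1/1/2021', ('Humidity', 'Tamilnadu')]
print(tn_humidity)
```

Here tn_humidity is the mean of the two 'Tamilnadu' humidity readings for '1/1/2021' (42 and 45).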

Summary

By following these steps, we used the pivot_table() method in the pandas library to group the sample data by 'Date' and 'State' and compute the mean 'Temperature' and 'Humidity' for each combination. The resulting pivot table DataFrame has one row per index value and one column for each combination of aggregated column and 'State', which makes the aggregated values easy to compare side by side.