Pandas DataFrame Pivot Method


Introduction

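In this lab, we will learn how to use the pivot() method in the Python Pandas library. The pivot() method reshapes a DataFrame without changing its data: the unique values of one column become the new row index, the unique values of another column become the new column labels, and the remaining values fill the cells.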




Importing pandas and creating the DataFrame

  • Start by importing the pandas library and creating a DataFrame using the pd.DataFrame() function.
import pandas as pd

data = {
  'crop': ['Rice', 'Wheat', 'Rice', 'Wheat', 'Rice', 'Wheat'],
  'state': ['karnataka', 'karnataka', 'Tamilnadu', 'Tamilnadu', 'Kerala', 'Kerala'],
  'Temperature': [29, 29, 31, 31, 25, 25],
  'Humidity': [50, 50, 62, 62, 45, 45]
}

df = pd.DataFrame(data)
print(df)
  • This will create a DataFrame with columns for 'crop', 'state', 'Temperature', and 'Humidity'.

Reshape the DataFrame using the pivot() method

  • To reshape the DataFrame, we can use the pivot() method and specify the index and column names.
df_pivot = df.pivot(index='crop', columns='state')
print(df_pivot)
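  • The pivot() method rearranges the DataFrame, using the unique values of 'crop' as the new row index and the unique values of 'state' as the new column labels. Because no values argument is given, every remaining column is kept: the result has two-level column labels, with 'Temperature' and 'Humidity' on the top level and the states beneath them. Each cell holds the value for one combination of 'crop' and 'state'.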
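
The two-level column layout is easier to see by inspecting the result directly. The following is a minimal sketch that rebuilds the same DataFrame and looks at the pivoted columns; the variable name temperature is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'crop': ['Rice', 'Wheat', 'Rice', 'Wheat', 'Rice', 'Wheat'],
    'state': ['karnataka', 'karnataka', 'Tamilnadu', 'Tamilnadu', 'Kerala', 'Kerala'],
    'Temperature': [29, 29, 31, 31, 25, 25],
    'Humidity': [50, 50, 62, 62, 45, 45],
})

df_pivot = df.pivot(index='crop', columns='state')

# The columns form a two-level index: (value column, state)
print(df_pivot.columns.nlevels)
print(list(df_pivot.columns.get_level_values(0).unique()))

# Selecting a top-level label returns one crop-by-state table
temperature = df_pivot['Temperature']
print(temperature.loc['Rice', 'Kerala'])
```

Selecting df_pivot['Temperature'] gives a table with one row per crop and one column per state.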

Specify the values parameter to select specific columns

  • If we only want to include specific columns in the reshaped DataFrame, we can use the values parameter in the pivot() method.
df_pivot_specific = df.pivot(index='crop', columns='state', values='Temperature')
print(df_pivot_specific)
  • The resulting DataFrame contains only the 'Temperature' values, with one row per crop and one column per state.
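
As a short sketch of how the values parameter changes the result, the example below pivots on a single column and then on a list of columns. The names temp and both are hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'crop': ['Rice', 'Wheat', 'Rice', 'Wheat', 'Rice', 'Wheat'],
    'state': ['karnataka', 'karnataka', 'Tamilnadu', 'Tamilnadu', 'Kerala', 'Kerala'],
    'Temperature': [29, 29, 31, 31, 25, 25],
    'Humidity': [50, 50, 62, 62, 45, 45],
})

# A single value column gives plain, single-level state columns
temp = df.pivot(index='crop', columns='state', values='Temperature')
print(temp.shape)
print(temp.loc['Wheat', 'Tamilnadu'])

# A list of names keeps several value columns (two-level columns again)
both = df.pivot(index='crop', columns='state', values=['Temperature', 'Humidity'])
print(both.columns.nlevels)
```

With a single name, individual cells can be read with temp.loc[crop, state]. With a list, the layout matches the pivot without values, restricted to the listed columns.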

Handle duplicates in the DataFrame

  • If two or more rows share the same combination of index and columns values, the pivot() method cannot decide which value belongs in the cell and raises a ValueError. In such cases, we need to remove or aggregate the duplicate entries before reshaping.
df_duplicated = pd.DataFrame({'crop': ['Rice', 'Rice', 'Wheat', 'Wheat', 'Rice', 'Wheat'],
                              'state': ['karnataka', 'karnataka', 'Tamilnadu', 'Tamilnadu', 'Kerala', 'Kerala'],
                              'Temperature': [29, 29, 31, 31, 25, 25],
                              'Humidity': [50, 50, 62, 62, 45, 45]})

df_duplicated_pivot = df_duplicated.pivot(index='crop', columns='state', values='Temperature')
print(df_duplicated_pivot)
  • In this example, the combinations ('Rice', 'karnataka') and ('Wheat', 'Tamilnadu') each appear twice, so calling the pivot() method results in a ValueError.
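
One possible way to deal with the error is sketched below, and it is not the only approach. It first locates the repeated pairs with duplicated(), then shows two options: dropping the repeats with drop_duplicates() before calling pivot(), or aggregating them with pivot_table(). The choice of aggfunc='mean' and the variable names are assumptions made for illustration.

```python
import pandas as pd

df_duplicated = pd.DataFrame({
    'crop': ['Rice', 'Rice', 'Wheat', 'Wheat', 'Rice', 'Wheat'],
    'state': ['karnataka', 'karnataka', 'Tamilnadu', 'Tamilnadu', 'Kerala', 'Kerala'],
    'Temperature': [29, 29, 31, 31, 25, 25],
    'Humidity': [50, 50, 62, 62, 45, 45],
})

# Mark every row whose (crop, state) pair occurs more than once
dupes = df_duplicated.duplicated(subset=['crop', 'state'], keep=False)
print(df_duplicated[dupes])

# pivot() refuses to reshape while those pairs are present
try:
    df_duplicated.pivot(index='crop', columns='state', values='Temperature')
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print('pivot failed:', err)

# Option 1: keep the first row of each pair, then pivot
deduped = df_duplicated.drop_duplicates(subset=['crop', 'state'])
pivot_deduped = deduped.pivot(index='crop', columns='state', values='Temperature')
print(pivot_deduped)

# Option 2: let pivot_table() aggregate the repeated rows
pivot_mean = df_duplicated.pivot_table(
    index='crop', columns='state', values='Temperature', aggfunc='mean'
)
print(pivot_mean)
```

Dropping duplicates is reasonable when the repeated rows are identical, as they are here. When repeated rows carry different values, an aggregation such as the mean makes the choice explicit. Combinations that never occur in the data, such as Rice in Tamilnadu, appear as NaN in the result.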

Summary

This lab covered the basic usage of the pivot() method in the Python Pandas library. The pivot() method reshapes a DataFrame by turning the unique values of one column into the row index and the unique values of another into the column labels. We learned how to reshape a DataFrame, how to select specific columns with the values parameter, and that duplicate index and column combinations raise a ValueError and must be dealt with before reshaping.
