How does the pivot function work?

The pivot function in Pandas is used to reshape a DataFrame by transforming or "pivoting" data from a long format to a wide format. This is particularly useful when you want to reorganize your data for better analysis or visualization.

How the `pivot` Function Works

The pivot function takes three main parameters:

index: The column(s) whose values become the row labels (index) of the new DataFrame. It can be a single column or a list of columns. If omitted, the existing index is used.
columns: The column whose unique values become the column labels of the pivoted DataFrame.
values: The column(s) whose values fill the cells of the new DataFrame. If not specified, all remaining columns are used, and the result has hierarchical (two-level) column labels.
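To make the effect of the values parameter concrete, here is a minimal sketch. The column names temp and humidity are hypothetical and chosen only for illustration; the point is the difference in the shape of the column labels when values is omitted versus given.

```python
import pandas as pd

# Hypothetical sample data: two measurement columns per (date, location) pair.
df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"],
    "location": ["A", "B", "A", "B"],
    "temp": [10, 20, 15, 25],
    "humidity": [30, 40, 35, 45],
})

# values omitted: every remaining column is pivoted, giving two-level column labels
# such as ('temp', 'A') and ('humidity', 'B').
wide_all = df.pivot(index="date", columns="location")

# values given: only that column is pivoted, giving single-level column labels.
wide_temp = df.pivot(index="date", columns="location", values="temp")

print(wide_all.columns.nlevels)    # 2
print(wide_temp.columns.tolist())  # ['A', 'B']
```

With the two-level result, a single cell is addressed by a tuple, for example wide_all.loc["2023-01-02", ("humidity", "B")].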

Basic Syntax

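DataFrame.pivot(*, columns, index=<no_default>, values=<no_default>)

This is the signature as of pandas 2.x: columns is required, index and values are optional, and all three must be passed as keyword arguments. Older versions documented it as DataFrame.pivot(index=None, columns=None, values=None).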

Example

Let's say you have the following DataFrame:

import pandas as pd

data = {
    'date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'location': ['A', 'B', 'A', 'B'],
    'value': [10, 20, 15, 25]
}

df = pd.DataFrame(data)

This DataFrame looks like this:

date	location	value
2023-01-01	A	10
2023-01-01	B	20
2023-01-02	A	15
2023-01-02	B	25

To pivot this DataFrame so that each location becomes a column, you can use:

pivoted_df = df.pivot(index='date', columns='location', values='value')

Resulting DataFrame

The resulting DataFrame (pivoted_df) will look like this:

date	A	B
2023-01-01	10	20
2023-01-02	15	25
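The table above is a simplified view. As a small sketch of what the pivoted object actually carries, the snippet below rebuilds the same example and inspects its labels. The name flat_df is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"],
    "location": ["A", "B", "A", "B"],
    "value": [10, 20, 15, 25],
})
pivoted_df = df.pivot(index="date", columns="location", values="value")

# The row labels come from 'date'; the column labels come from 'location'.
print(pivoted_df.index.name)    # date
print(pivoted_df.columns.name)  # location

# Look up a single cell by row label and column label.
print(pivoted_df.loc["2023-01-02", "A"])  # 15

# Optional: turn 'date' back into an ordinary column and drop the 'location' axis name.
flat_df = pivoted_df.reset_index().rename_axis(None, axis=1)
print(flat_df.columns.tolist())  # ['date', 'A', 'B']
```

Flattening with reset_index is a matter of taste; it is mainly handy when the result is exported or merged with other tables.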

Key Points

Unique Values: Each combination of index and columns values must be unique. If any combination appears more than once, pivot raises a ValueError, because it cannot place two values in a single cell.
Alternative: If you have duplicates and want to aggregate them, use pivot_table instead. It accepts an aggregation function through its aggfunc parameter, such as mean or sum.
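Both points can be seen in a short sketch. The data below is hypothetical and deliberately contains a duplicate (date, location) pair; the mean is just one possible choice of aggregation.

```python
import pandas as pd

# Hypothetical data: the pair ('2023-01-01', 'A') appears twice.
dup_df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-01", "2023-01-02"],
    "location": ["A", "A", "B", "A"],
    "value": [10, 30, 20, 15],
})

# pivot cannot decide which of the two values belongs in the single cell.
try:
    dup_df.pivot(index="date", columns="location", values="value")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print(f"pivot failed: {exc}")

# pivot_table resolves the duplicates by aggregating them, here with the mean.
table = dup_df.pivot_table(
    index="date", columns="location", values="value", aggfunc="mean"
)
print(table)
```

Here the duplicated cell becomes the mean of 10 and 30, and the missing ('2023-01-02', 'B') combination shows up as NaN.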

Further Learning

To explore more about the pivot function and its applications, check out the Pandas Documentation.

If you have any more questions or need further clarification, feel free to ask!