The pivot function in Pandas is used to reshape a DataFrame by transforming or "pivoting" data from a long format to a wide format. This is particularly useful when you want to reorganize your data for better analysis or visualization.
How the pivot Function Works
The pivot function takes three main parameters:
- index: The column(s) to use as the new DataFrame's index. It can be a single column or a list of columns.
- columns: The column(s) whose unique values become the columns of the pivoted DataFrame.
- values: The column(s) used to fill the new DataFrame. If not specified, all remaining columns are used, and the result has hierarchically indexed columns.
Basic Syntax
DataFrame.pivot(index=None, columns=None, values=None)
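To make the role of the values parameter concrete, here is a minimal sketch comparing a call that omits values with one that passes it. The extra count column and the names wide_all and wide_value are assumptions added for illustration only.

```python
import pandas as pd

# Hypothetical sample data with two non-key columns ("value" and "count").
df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"],
    "location": ["A", "B", "A", "B"],
    "value": [10, 20, 15, 25],
    "count": [1, 2, 3, 4],
})

# values omitted: every remaining column is pivoted, so the result has
# two-level columns of the form (original column, location).
wide_all = df.pivot(index="date", columns="location")

# values given: only that column is pivoted, giving single-level columns.
wide_value = df.pivot(index="date", columns="location", values="value")

print(wide_all.columns.nlevels)     # 2
print(wide_value.columns.tolist())  # ['A', 'B']
```

Passing values explicitly is usually the simpler choice when only one measurement is needed, since it avoids the two-level column index.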
Example
Let's say you have the following DataFrame:
import pandas as pd
data = {
'date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
'location': ['A', 'B', 'A', 'B'],
'value': [10, 20, 15, 25]
}
df = pd.DataFrame(data)
This DataFrame looks like this:
| date | location | value |
|---|---|---|
| 2023-01-01 | A | 10 |
| 2023-01-01 | B | 20 |
| 2023-01-02 | A | 15 |
| 2023-01-02 | B | 25 |
To pivot this DataFrame so that each location becomes a column, you can use:
pivoted_df = df.pivot(index='date', columns='location', values='value')
Resulting DataFrame
The resulting DataFrame (pivoted_df) will look like this:
| date | A | B |
|---|---|---|
| 2023-01-01 | 10 | 20 |
| 2023-01-02 | 15 | 25 |
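The table above hides one detail: after pivoting, date is the row index, not an ordinary column, and location is the name of the column axis. The sketch below rebuilds the same example and shows how to inspect the result. The final flattening step and the name flat are optional additions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"],
    "location": ["A", "B", "A", "B"],
    "value": [10, 20, 15, 25],
})
pivoted_df = df.pivot(index="date", columns="location", values="value")

# 'date' is now the row index and 'location' names the column axis.
print(pivoted_df.index.name)    # date
print(pivoted_df.columns.name)  # location

# Look up a single cell by row label and column label.
print(pivoted_df.loc["2023-01-02", "B"])  # 25

# Optional: turn 'date' back into an ordinary column and drop the axis name.
flat = pivoted_df.reset_index().rename_axis(None, axis=1)
print(flat.columns.tolist())    # ['date', 'A', 'B']
```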
Key Points
- Unique Values: The pivot function requires each combination of index and columns values to be unique. If there are duplicate entries, you will encounter a ValueError.
- Alternative: If you have duplicates and want to aggregate them, consider using pivot_table, which allows for aggregation functions like mean, sum, etc.
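Both points can be seen in a short sketch. The data below is hypothetical and deliberately repeats the pair ('2023-01-01', 'A'). The names dup_df, pivot_error, and summed are chosen for illustration.

```python
import pandas as pd

# Hypothetical data: the (date, location) pair ('2023-01-01', 'A') appears twice.
dup_df = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-01", "2023-01-02"],
    "location": ["A", "A", "B", "A"],
    "value": [10, 30, 20, 15],
})

# pivot cannot decide which of the two values belongs in the cell, so it raises.
pivot_error = None
try:
    dup_df.pivot(index="date", columns="location", values="value")
except ValueError as exc:
    pivot_error = exc
    print("pivot failed:", exc)

# pivot_table resolves duplicates by aggregating them, here with a sum.
summed = dup_df.pivot_table(
    index="date", columns="location", values="value", aggfunc="sum"
)
print(summed)
```

In this sketch the duplicated cell becomes 10 + 30 = 40. The combination ('2023-01-02', 'B') has no rows, so that cell is NaN. pivot_table also accepts a fill_value argument if a placeholder such as 0 is preferred.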
Further Learning
To explore more about the pivot function and its applications, check out the Pandas Documentation.
If you have any more questions or need further clarification, feel free to ask!
