To use the melt function in pandas, follow these steps:
- Import pandas: Ensure you have the pandas library imported.
- Create a DataFrame: Define your DataFrame with the data you want to melt.
- Call the `melt` function: Use the `melt` function on the DataFrame, specifying the necessary parameters.
Syntax
`pd.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)`
Key Parameters
- frame: The DataFrame you want to melt.
- id_vars: Columns to keep as identifier variables (these will not be melted).
- value_vars: Columns to melt into a single column of values. If not specified, all columns not in `id_vars` will be melted.
- var_name: Name of the new column that will hold the names of the melted columns.
- value_name: Name of the new column that will hold the values from the melted columns.
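As a minimal sketch of how `id_vars` and `value_vars` interact, the snippet below melts only two of three subject columns. The DataFrame contents are illustrative, and the variable names `partial` and `same` are hypothetical. It also shows that the top-level `pd.melt` and the `DataFrame.melt` method accept the same parameters.

```python
import pandas as pd

# Illustrative data: one identifier column and three measurement columns.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [85, 78],
    'Science': [90, 82],
    'English': [88, 80],
})

# Melt only Math and Science. English is dropped from the result
# because it appears in neither id_vars nor value_vars.
partial = pd.melt(df, id_vars=['Name'], value_vars=['Math', 'Science'])

# The method form takes the same parameters and gives the same result.
same = df.melt(id_vars=['Name'], value_vars=['Math', 'Science'])

print(partial)
```

Because `var_name` and `value_name` are omitted here, the new columns fall back to the default names `variable` and `value`.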
Example
Here’s a step-by-step example:
- Import pandas:

      import pandas as pd

- Create a DataFrame:

      data = {
          'Name': ['Alice', 'Bob'],
          'Math': [85, 78],
          'Science': [90, 82],
          'English': [88, 80]
      }
      df = pd.DataFrame(data)

- Use the `melt` function:

      melted_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')
Resulting DataFrame
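The resulting `melted_df` will look like this. Note that `melt` stacks the subject columns one after another, so the rows are ordered by subject, not by name:
| Name | Subject | Score |
|---|---|---|
| Alice | Math | 85 |
| Bob | Math | 78 |
| Alice | Science | 90 |
| Bob | Science | 82 |
| Alice | English | 88 |
| Bob | English | 80 |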
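The steps above can be collected into one runnable script. `melt` stacks the value columns one after another, so its output is ordered by subject first. The optional sort at the end is an addition for illustration (the name `by_student` is hypothetical) and groups each student's rows together.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob'],
    'Math': [85, 78],
    'Science': [90, 82],
    'English': [88, 80],
}
df = pd.DataFrame(data)

# Wide to long: one row per (Name, Subject) pair.
melted_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')
print(melted_df)

# Optional: group rows by student. A stable sort keeps the
# Math, Science, English order within each name.
by_student = melted_df.sort_values('Name', kind='stable').reset_index(drop=True)
print(by_student)
```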
Explanation
- `id_vars='Name'`: The `Name` column is specified as the identifier variable, so it remains unchanged and is repeated for each subject.
- `var_name='Subject'`: This parameter specifies the name of the new column that will hold the names of the melted columns (Math, Science, English).
- `value_name='Score'`: This parameter specifies the name of the new column that will hold the values from the melted columns.
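To see what `var_name` and `value_name` contribute, it can help to leave them out. The sketch below uses a reduced, illustrative DataFrame and a hypothetical variable name `default_names`.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [85, 78],
    'Science': [90, 82],
})

# Without var_name and value_name, pandas uses its default column names.
default_names = df.melt(id_vars='Name')
print(default_names.columns.tolist())  # ['Name', 'variable', 'value']
```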
Summary
The melt function transforms a DataFrame from a wide format (one column per measurement) to a long format (one row per measurement). The long format is often easier to analyze and visualize, especially when you need to group, aggregate, or plot across multiple measurements or categories.
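As one hedged illustration of why the long format helps analysis, the sketch below averages scores per subject with `groupby` and then restores the wide layout with `pivot`. Both follow-up steps are additions for illustration, not part of the example above, and the names `long_df`, `avg_by_subject`, and `wide_again` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [85, 78],
    'Science': [90, 82],
    'English': [88, 80],
})

long_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')

# With one row per measurement, a per-subject average is a single groupby.
avg_by_subject = long_df.groupby('Subject')['Score'].mean()
print(avg_by_subject)

# pivot reverses the reshape: one row per Name, one column per Subject.
wide_again = long_df.pivot(index='Name', columns='Subject', values='Score')
print(wide_again)
```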
