Pandas DataFrame | melt method
Pandas' DataFrame.melt(~) method converts the format of the source DataFrame from "wide" to "long".
Let's go through a quick example. Consider the following DataFrame:
name | age | height |
---|---|---|
alex | 40 | 150 |
bob | 50 | 160 |
This is considered to be a "wide" DataFrame since each row captures all relevant data about that person. Converting this to a "long" DataFrame gives:
name | variable | value |
---|---|---|
alex | age | 40 |
alex | height | 150 |
bob | age | 50 |
bob | height | 160 |
Now, each row captures a single variable about that person, which inevitably results in a vertically "long" DataFrame.
Pandas uses the term "unpivot" to denote the action of elongating the DataFrame based on a variable. In this example, we are unpivoting the variables age and height.
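As a minimal sketch of the conversion above, the wide table can be built and unpivoted as follows. The names wide and long_df are illustrative only, and the sort is there just to group each person's rows as in the table; melt itself places all age rows before the height rows.

```python
import pandas as pd

# The "wide" table from above: one row per person
wide = pd.DataFrame({"name": ["alex", "bob"], "age": [40, 50], "height": [150, 160]})

# Unpivot every column except the identifier column "name"
long_df = wide.melt(id_vars="name")

# Sort only to group each person's rows together, as in the table above
long_df = long_df.sort_values(["name", "variable"]).reset_index(drop=True)
print(long_df)
```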
Parameters
1. id_vars | tuple or list or NumPy array | optional
The label of the columns to be used as the identifier.
2. value_vars | tuple or list or NumPy array | optional
The label of the columns to unpivot. By default, all columns will be unpivoted.
3. var_name | scalar | optional
The label of the variable column. By default, the label is "variable", unless the columns index of the DataFrame has a name, in which case that name is used.
4. value_name | scalar | optional
The label of the value column. By default, value_name="value".
5. col_level | int or string | optional
The column level to unpivot. This is only relevant when the columns of the DataFrame are a multi-index.
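Since col_level only matters for multi-index columns, a small sketch may help. The second column level here ("text", "years", "cm") is a hypothetical set of labels made up for illustration, and the identifier is passed as a list.

```python
import pandas as pd

# Columns with two levels; the second level is hypothetical unit labels
columns = pd.MultiIndex.from_tuples(
    [("name", "text"), ("age", "years"), ("height", "cm")]
)
df_multi = pd.DataFrame([["alex", 40, 150], ["bob", 50, 160]], columns=columns)

# Unpivot using only the labels found on level 0 of the columns
melted = df_multi.melt(col_level=0, id_vars=["name"])
print(melted)
```

With col_level=0, the labels in the variable column come from the top level (age and height); the second level is ignored.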
Return Value
A DataFrame that has been unpivoted.
Examples
Consider the following DataFrame:
df
    name  age  height
0   alex   40     150
1    bob   50     160
2  cathy   60     170
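The page does not show how df is created. One way to construct it, assuming a plain dictionary of columns, is:

```python
import pandas as pd

# Build the example DataFrame column by column
df = pd.DataFrame({
    "name": ["alex", "bob", "cathy"],
    "age": [40, 50, 60],
    "height": [150, 160, 170],
})
print(df)
```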
Basic usage
Here, the identifier column is name, and the columns to unpivot are age and height:
df.melt(id_vars="name", value_vars=["age","height"])
    name variable  value
0   alex      age     40
1    bob      age     50
2  cathy      age     60
3   alex   height    150
4    bob   height    160
5  cathy   height    170
Here, we can actually omit the value_vars parameter since, by default, all columns except the identifier column (name) will be unpivoted.
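A quick way to check this equivalence, sketched here with df rebuilt so that the snippet runs on its own (the names explicit and implicit are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["alex", "bob", "cathy"],
    "age": [40, 50, 60],
    "height": [150, 160, 170],
})

# With value_vars spelled out
explicit = df.melt(id_vars="name", value_vars=["age", "height"])

# With value_vars omitted: every non-identifier column is unpivoted
implicit = df.melt(id_vars="name")

# Compare the two results element by element
print(explicit.equals(implicit))
```

For this DataFrame, the comparison should print True.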
Specifying value_vars
Suppose we wanted to unpivot just one column instead of two. We can specify which columns to unpivot using the value_vars parameter:
df.melt(id_vars="name", value_vars="age")
    name variable  value
0   alex      age     40
1    bob      age     50
2  cathy      age     60
Notice that the height column has been dropped. If we want to keep it, we must include it in the id_vars parameter, like so:
df.melt(id_vars=["name","height"], value_vars="age")
    name  height variable  value
0   alex     150      age     40
1    bob     160      age     50
2  cathy     170      age     60
Specifying var_name and value_name
Here, we show df again for your reference:
df
    name  age  height
0   alex   40     150
1    bob   50     160
2  cathy   60     170
As we've just seen in the examples above, the new column labels are "variable" and "value" by default. We can change this by specifying var_name and value_name, like so:
df.melt(id_vars="name", var_name="attribute", value_name="number")
    name attribute  number
0   alex       age      40
1    bob       age      50
2  cathy       age      60
3   alex    height     150
4    bob    height     160
5  cathy    height     170
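As a related sketch, when var_name is omitted pandas falls back to the name of the columns index if one is set, and to "variable" only otherwise. The example below names the columns index "attribute" with rename_axis (a choice made here for illustration) and passes only value_name:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["alex", "bob", "cathy"],
    "age": [40, 50, 60],
    "height": [150, 160, 170],
})

# Give the columns index a name; rename_axis returns a new DataFrame
named = df.rename_axis(columns="attribute")

# var_name is not passed, so the columns index name is used instead
melted = named.melt(id_vars="name", value_name="number")
print(melted.columns.tolist())
```

The resulting labels should match those produced by passing var_name="attribute" explicitly.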