Pandas DataFrame | unstack method
Pandas' DataFrame.unstack(~) method converts the specified row levels to column levels. This is the reverse of stack(~).
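As a minimal sketch of the "reverse of stack(~)" relationship, the round trip below unstacks a small DataFrame and stacks it again. The data is made up for illustration, and every group has the same names so that no NaN appears along the way.

```python
import pandas as pd

# Hypothetical data: both groups contain the same two names
index = pd.MultiIndex.from_tuples(
    [("A", "alice"), ("A", "bob"), ("B", "alice"), ("B", "bob")]
)
df = pd.DataFrame({"age": [2, 3, 4, 5]}, index=index)

wide = df.unstack()   # inner-most row level moves into the columns
tall = wide.stack()   # inner-most column level moves back into the rows

print(wide)
print(tall)
print(tall.equals(df))  # compares the round-tripped frame with the original
```

When some combinations are missing, the round trip may not be exact, because unstacking introduces NaN and can change the dtype.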
Parameters
1. level | int or string or list of such | optional
The integer index or name(s) of the row level(s) to convert into column levels. By default, level=-1, which means that the inner-most row level is converted.
2. fill_value | int or string or dict | optional
The value used to fill NaN in the resulting Series/DataFrame. Note that NaN in the original DataFrame will not be filled; only those that appear due to this method will be filled. By default, NaN is left as is.
Return Value
A Series or a DataFrame.
Examples
Unstacking single-level DataFrames
Consider the following single-level DataFrame:
import pandas as pd

df = pd.DataFrame({"age":[2,3],"height":[4,5]}, index=["alice","bob"])
df

       age  height
alice    2       4
bob      3       5
Calling unstack() on df gives:
df.unstack()

age     alice    2
        bob      3
height  alice    4
        bob      5
dtype: int64
Here, note the following:
The return type is Series, with two levels.
The row labels and the column labels in df have merged to form a multi-index.
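To make the two points above concrete, the following sketch inspects the returned Series. It reuses the same df as the example; the variable names s, value and levels are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"age": [2, 3], "height": [4, 5]}, index=["alice", "bob"])
s = df.unstack()

# The outer level holds the original column labels,
# the inner level holds the original row labels.
levels = s.index.nlevels

# A single cell is looked up with a (column label, row label) tuple
value = s.loc[("height", "bob")]

print(type(s).__name__, levels, value)
```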
Unstacking DataFrames with multi-level rows
Consider the following DataFrame with multi-level rows:
index = [("A","alice"), ("A","bob"),("B","cathy"),("B","david")]
multi_index = pd.MultiIndex.from_tuples(index)
df = pd.DataFrame({"age":[2,3,4,5],"height":[6,7,8,9]}, index=multi_index)
df

         age  height
A alice    2       6
  bob      3       7
B cathy    4       8
  david    5       9
By default, level=-1, which means that the inner-most row level ([alice,bob,cathy,david]) will be converted into a column level:
df.unstack()

    age                   height
  alice  bob cathy david  alice  bob cathy david
A   2.0  3.0   NaN   NaN    6.0  7.0   NaN   NaN
B   NaN  NaN   4.0   5.0    NaN  NaN   8.0   9.0
Note the following:
The inner-most row level ([alice, bob, cathy, david]) became a column level, and is positioned as the inner-most column level.
Unstacking multi-level rows often yields many NaN since, for instance, no data exists about the age of alice in group B.
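One side effect worth checking is that the unstacked values print as 2.0 rather than 2. With the default integer dtype, columns that gain NaN are upcast to float64. The sketch below counts the missing cells and inspects the dtypes; the variable names are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", "alice"), ("A", "bob"), ("B", "cathy"), ("B", "david")]
)
df = pd.DataFrame({"age": [2, 3, 4, 5], "height": [6, 7, 8, 9]}, index=index)

wide = df.unstack()

# Number of NaN introduced in each of the new columns
missing_per_column = wide.isna().sum()

# The original columns were integers; NaN forces a float dtype
print(df.dtypes)
print(wide.dtypes)
print(missing_per_column)
```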
To specify which levels to convert, pass the level parameter like so:
df.unstack(level=0)

       age      height
         A    B      A    B
alice  2.0  NaN    6.0  NaN
bob    3.0  NaN    7.0  NaN
cathy  NaN  4.0    NaN  8.0
david  NaN  5.0    NaN  9.0
Here, level=0 means that the outer-most row level ([A,B]) is converted into a column level.
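The level parameter also accepts a level name instead of a position. The sketch below assumes the row levels are named "group" and "name"; these names are hypothetical and do not appear in the example above, where the levels are unnamed.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", "alice"), ("A", "bob"), ("B", "cathy"), ("B", "david")],
    names=["group", "name"],  # hypothetical level names, added for illustration
)
df = pd.DataFrame({"age": [2, 3, 4, 5], "height": [6, 7, 8, 9]}, index=index)

by_name = df.unstack(level="group")   # refer to the outer-most level by name
by_position = df.unstack(level=0)     # refer to the same level by position

print(by_name)
print(by_name.equals(by_position))
```

Referring to levels by name tends to be more robust than using positions when the order of the index levels may change.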
Specifying fill_value
By default, fill_value=None, which means that NaN in the resulting Series/DataFrame is left as is.
To fill all NaN with a value instead, pass in fill_value like so:
df.unstack(level=0, fill_value="@")

      age    height
        A  B      A  B
alice   2  @      6  @
bob     3  @      7  @
cathy   @  4      @  8
david   @  5      @  9
Note that NaN that pre-existed in the original DataFrame will not be filled; only those caused by this unstacking process will be filled by fill_value.
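The distinction between pre-existing NaN and NaN created by unstacking can be checked with a small sketch. The DataFrame below is made up for illustration and deliberately contains one NaN before unstack is called.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples([("A", "alice"), ("A", "bob"), ("B", "cathy")])
# bob's age is already missing in the original data
df = pd.DataFrame({"age": [2.0, np.nan, 4.0]}, index=index)

filled = df.unstack(fill_value=0)

# Pre-existing NaN: bob in group A stays NaN
pre_existing = filled.loc["A", ("age", "bob")]

# NaN created by unstacking: cathy has no entry in group A, so it is filled
created = filled.loc["A", ("age", "cathy")]

print(filled)
print(pre_existing, created)
```

If the pre-existing NaN should be replaced too, a separate fillna(~) call on the result is one option.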