Pandas DataFrame | pct_change method
Pandas DataFrame.pct_change(~) computes the percentage change between consecutive values in each column of the DataFrame. The result is expressed as a fraction, so 1.0 means a 100% increase.
Parameters
1. periods | int | optional
The number of rows to look back when computing the change. If periods=2, the percentage change is computed against the value two rows back. By default, periods=1, which means that the value in the previous row is used.
2. fill_method | string | optional
The rule by which to fill missing values before computing the percentage change:
Value | Description |
---|---|
"bfill" | Use the next non-NaN value |
"pad" | Use the previous non-NaN value |
By default, fill_method="pad".
Regardless of fill_method, the first row will always be NaN since there is no prior value with which to compute the percentage change.
3. limit | int | optional
The maximum number of consecutive NaN values to fill. By default, limit=None, which places no cap on the number of values filled.
4. freq | string or timedelta or DateOffset | optional
The time interval to use when the DataFrame is a time series. By default, freq=None.
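The limit parameter described above has no example further down, so here is a rough sketch of what it controls. It fills explicitly with DataFrame.ffill(limit=1) and then calls pct_change(fill_method=None) instead of passing fill_method and limit directly, because pandas 2.1 and later deprecate those two arguments. The single-column DataFrame is made up for illustration and does not appear elsewhere on this page.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two consecutive NaN values
df = pd.DataFrame({"A": [1.0, np.nan, np.nan, 4.0]})

# Fill at most one consecutive NaN with the previous value: [1.0, 1.0, NaN, 4.0]
filled = df.ffill(limit=1)

# fill_method=None tells pct_change not to do any filling of its own
result = filled.pct_change(fill_method=None)
print(result)
```

Only the second row yields a number (0.0). The third row is still NaN after the capped fill, and the fourth row is NaN because its prior value is missing.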
Return Value
A DataFrame holding the percentage changes of the values in each column.
Examples
Basic usage
Consider the following DataFrame:
df
    A   B
0   2   1
1   4   3
2  12  15
To compute the percentage change of consecutive values for each column in df:
df.pct_change()
     A    B
0  NaN  NaN
1  1.0  2.0
2  2.0  4.0
Here, note the following:
- The first row is always NaN because there is no prior value with which to compute the percentage change.
- To see how these percentage changes are calculated, take the bottom-right value (4.0) as an example. It is computed by taking the difference between the current value and the prior value in df (15-3=12), and then dividing this difference by the prior value (12/3=4.0).
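The arithmetic above can be reproduced by hand. The following is a minimal sketch, assuming the df from this example is rebuilt from a dictionary (its construction is not shown on this page). It compares pct_change() against the explicit formula (current - previous) / previous, using shift(1) to obtain the previous row:

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame (construction assumed; only its printed form is given)
df = pd.DataFrame({"A": [2, 4, 12], "B": [1, 3, 15]})

# shift(1) moves every value down one row, so each row lines up with its prior value
previous = df.shift(1)

# (current - previous) / previous, computed element-wise
manual = (df - previous) / previous
builtin = df.pct_change()

# Both results have NaN in the first row and matching values elsewhere
print(np.allclose(manual.to_numpy(), builtin.to_numpy(), equal_nan=True))
```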
Specifying periods
Consider the following DataFrame:
df
    A   B
0   2   1
1   4   3
2  12  15
By default, periods=1, which means that the previous row is used to compute the percentage change:
df.pct_change() # periods=1
     A    B
0  NaN  NaN
1  1.0  2.0
2  2.0  4.0
To compute the percentage change against the values two rows back:
df.pct_change(periods=2)
     A     B
0  NaN   NaN
1  NaN   NaN
2  5.0  14.0
We get NaN for the first two rows because neither of them has a row two positions back to compare with.
To use the subsequent row to compute the percentage change, set periods=-1.
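As a brief sketch of the periods=-1 case, assuming the same df is rebuilt from a dictionary (the construction is not shown on this page), each row is compared with the row after it, that is (current - next) / next:

```python
import pandas as pd

# Rebuild the example DataFrame (construction assumed)
df = pd.DataFrame({"A": [2, 4, 12], "B": [1, 3, 15]})

# periods=-1 uses the following row as the reference value
forward = df.pct_change(periods=-1)
print(forward)
```

For instance, the top-left value is (2-4)/4 = -0.5. With a negative period it is the last row, rather than the first, that is NaN, since it has no subsequent row.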
Specifying fill_method
Consider the following DataFrame with some missing values:
df
      A    B
0   2.0  1.0
1   NaN  3.0
2  12.0  NaN
pad
By default, fill_method="pad", which means that the previous non-NaN value is used to fill NaN:
df.pct_change() # fill_method="pad"
     A    B
0  NaN  NaN
1  0.0  2.0
2  5.0  0.0
Note that this is equivalent to calling pct_change() on the following:
    A  B
0   2  1
1   2  3
2  12  3
Regardless of fill_method, the first row will always be NaN since there is no prior value with which to compute the percentage change.
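The equivalence above can be written out explicitly. This is a minimal sketch, assuming df is rebuilt from a dictionary with np.nan for the missing entries (the construction is not shown on this page). It forward-fills first and then calls pct_change(fill_method=None), which also sidesteps the fill_method argument that pandas 2.1 and later deprecate for values other than None:

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame with missing values (construction assumed)
df = pd.DataFrame({"A": [2.0, np.nan, 12.0], "B": [1.0, 3.0, np.nan]})

# Forward-fill: each NaN takes the previous non-NaN value in its column
filled = df.ffill()

# fill_method=None tells pct_change not to fill anything itself
result = filled.pct_change(fill_method=None)
print(result)
```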
bfill
To fill NaN using the next non-NaN value in the DataFrame:
df.pct_change(fill_method="bfill")
     A    B
0  NaN  NaN
1  5.0  2.0
2  0.0  NaN
Note that this is equivalent to calling pct_change() on the following:
    A    B
0   2    1
1  12    3
2  12  NaN
Notice how the NaN in the bottom-right corner is still NaN. This is because there is no subsequent row from which to take a non-NaN value.
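The backward fill can likewise be performed as a separate step. In this sketch (df is again rebuilt from a dictionary, which is an assumption), DataFrame.bfill() is applied before pct_change(fill_method=None), which makes it easy to inspect the intermediate DataFrame and confirm that the trailing NaN in column B survives the fill:

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame with missing values (construction assumed)
df = pd.DataFrame({"A": [2.0, np.nan, 12.0], "B": [1.0, 3.0, np.nan]})

# Backward-fill: each NaN takes the next non-NaN value in its column.
# The last value of B has nothing after it, so it stays NaN.
filled = df.bfill()

# Compute the change on the filled data without any further filling
result = filled.pct_change(fill_method=None)
print(result)
```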
Specifying freq
Consider the following time-series DataFrame:
df
             A   B
2020-12-20   2   1
2020-12-21   4   3
2020-12-22  12  15
2020-12-23  24  30
To compute the percentage change over every 2 days (e.g. between 12-20 and 12-22):
df.pct_change(freq="2D")
              A     B
2020-12-20  NaN   NaN
2020-12-21  NaN   NaN
2020-12-22  5.0  14.0
2020-12-23  5.0   9.0
Here, we get NaN values for the first 2 rows because there are no dates 12-18 and 12-19 with which to compute the percentage change.
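To make the date-based lookup concrete, the sketch below rebuilds the time series with a daily DatetimeIndex (the construction via pd.date_range is an assumption, since only the printed DataFrame is given) and compares pct_change(freq="2D") with a manual version that shifts the index forward by two days before dividing:

```python
import numpy as np
import pandas as pd

# Rebuild the time-series example with a daily DatetimeIndex (construction assumed)
index = pd.date_range("2020-12-20", periods=4, freq="D")
df = pd.DataFrame({"A": [2, 4, 12, 24], "B": [1, 3, 15, 30]}, index=index)

result = df.pct_change(freq="2D")

# Shifting the index by two days lines each date up with the value from two days earlier
shifted = df.shift(freq="2D")

# Division aligns on dates; reindex keeps only the original dates
manual = (df / shifted - 1).reindex(df.index)

print(np.allclose(result.to_numpy(), manual.to_numpy(), equal_nan=True))
```

Because the comparison is made by date rather than by row position, a gap in the dates would produce NaN for any row whose date two days earlier is absent.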