Pandas DataFrame | cumsum method
Pandas DataFrame.cumsum(~) method computes the cumulative sum along the rows or columns of the source DataFrame.
Parameters
1. axis | int or string | optional
Whether to compute the cumulative sum along the row or the column:
Axis | Description |
---|---|
0 or "index" | Compute the cumulative sum of each column. |
1 or "columns" | Compute the cumulative sum of each row. |
By default, axis=0.
2. skipna | boolean | optional
Whether or not to ignore NaN. By default, skipna=True.
Return Value
A DataFrame of the same shape as the source, holding the cumulative sum of each row or column.
Examples
Consider the following DataFrame:
df
   A  B
0  3  5
1  4  6
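The code that builds this DataFrame is not shown here. A minimal sketch of one way to create it, assuming the column labels and values are exactly those in the printed output:

```python
import pandas as pd

# Assumed construction: labels and values are read off the printed DataFrame.
df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
print(df)
```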
Cumulative sum of each column
To compute the cumulative sum for each column:
df.cumsum()
   A   B
0  3   5
1  7  11
Cumulative sum of each row
To compute the cumulative sum for each row:
df.cumsum(axis=1)
   A   B
0  3   8
1  4  10
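The axis parameter also accepts a string. As a small sketch, assuming the same two-column DataFrame as above, passing axis="columns" should give the same result as axis=1:

```python
import pandas as pd

# Same values as the example DataFrame (construction assumed).
df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

by_number = df.cumsum(axis=1)        # integer form
by_name = df.cumsum(axis="columns")  # string form

# Both calls accumulate across each row, left to right.
print(by_number.equals(by_name))
```

The string form can make the direction of accumulation easier to read in longer scripts.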
Dealing with missing values
Consider the following DataFrame with a missing value:
df
     A
0  3.0
1  NaN
2  5.0
By default, skipna=True, which means that missing values are skipped and do not affect the running sum. The missing entry itself remains NaN in the result:
df.cumsum() # skipna=True
     A
0  3.0
1  NaN
2  8.0
To take into account the missing values:
df.cumsum(skipna=False)
     A
0  3.0
1  NaN
2  NaN
Here, notice how every value from the first NaN onward is also NaN. This is because the sum of a scalar and a NaN in Pandas is a NaN, that is:
import numpy as np
5 + np.nan
nan
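The two skipna behaviours can be compared side by side. This is a minimal sketch, assuming the single-column DataFrame from this section is built from a list containing np.nan:

```python
import numpy as np
import pandas as pd

# Assumed construction of the DataFrame with one missing value.
df = pd.DataFrame({"A": [3.0, np.nan, 5.0]})

skipped = df.cumsum()                 # skipna=True (default): NaN is left out of the running sum
propagated = df.cumsum(skipna=False)  # NaN enters the running sum and carries forward

print(skipped["A"].tolist())     # [3.0, nan, 8.0]
print(propagated["A"].tolist())  # [3.0, nan, nan]
```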