Pandas DataFrame | mad method
Pandas DataFrame.mad(~) method computes the mean absolute deviation (MAD) for each row or column of the DataFrame.
MAD is calculated as follows:
$$\mathrm{MAD}=\frac{1}{N}\sum_{i=1}^{N}\left|x_i-\bar{x}\right|$$
Where,
$N$ is the number of data points in the row/column
$x_i$ is the $i$-th value in the row/column
$\bar{x}$ is the mean of the row/column
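To make the definition concrete, here is a minimal sketch that evaluates it by hand with NumPy for the values 2, 4 and 6. The variable names are illustrative only and simply mirror the symbols above.

```python
import numpy as np

x = np.array([2, 4, 6])              # the data points x_i
N = x.size                           # number of data points
x_bar = x.mean()                     # mean of the data points: 4.0
mad = np.abs(x - x_bar).sum() / N    # (|2-4| + |4-4| + |6-4|) / 3
print(mad)                           # roughly 1.3333
```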
Parameters
1. axis | int or string | optional
Whether to compute the MAD row-wise or column-wise:
Axis | Description |
---|---|
0 or "index" | MAD is computed for each column. |
1 or "columns" | MAD is computed for each row. |
By default, axis=0.
2. skipna | boolean | optional
Whether or not to skip NaN. Skipped NaN values do not count towards the total size ($N$), which is the divisor when computing MAD. By default, skipna=True.
3. level | string or int | optional
The name or the integer index of the level to consider. This is relevant only if your DataFrame has a MultiIndex.
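As a rough illustration of what a per-level MAD means, the sketch below applies the MAD formula within each group of an outer index level using groupby, rather than calling mad itself. The data and the level names ("group", "item") are made up for this example.

```python
import pandas as pd

# Hypothetical two-level index: "group" is the outer level, "item" the inner one
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["group", "item"]
)
df = pd.DataFrame({"A": [2, 4, 6, 12]}, index=index)

# Mean of each group, broadcast back to the shape of df
group_means = df.groupby(level="group").transform("mean")

# Absolute deviation of every value from its own group's mean
deviations = (df - group_means).abs()

# Average the deviations within each group: one row per value of "group"
mad_per_group = deviations.groupby(level="group").mean()
print(mad_per_group)
```

The result here is a DataFrame with one row per value of the chosen level, which is consistent with the return type described for the level parameter.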
Return Value
If the level parameter is specified, then a DataFrame will be returned. Otherwise, a Series will be returned.
Examples
Consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({"A":[2,4,6],"B":[2,5,8]})
df
   A  B
0  2  2
1  4  5
2  6  8
Computing MAD column-wise
To compute MAD for each column:
df.mad()
A    1.333333
B    2.000000
dtype: float64
Computing MAD row-wise
To compute MAD for each row:
df.mad(axis=1)
0    0.0
1    0.5
2    1.0
dtype: float64
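DataFrame.mad was deprecated in pandas 1.5 and removed in pandas 2.0, so the calls above may raise an AttributeError on recent versions. The following is a minimal sketch of an equivalent computation built from mean, sub and abs. Note that mean_abs_dev is a hypothetical helper name chosen for this example, not a pandas function.

```python
import pandas as pd

def mean_abs_dev(df, axis=0):
    """Mean absolute deviation along an axis (0: per column, 1: per row)."""
    center = df.mean(axis=axis)
    # axis=0: center holds one mean per column, so align it with the columns (axis=1).
    # axis=1: center holds one mean per row, so align it with the index (axis=0).
    deviations = df.sub(center, axis=1 - axis)
    return deviations.abs().mean(axis=axis)

df = pd.DataFrame({"A": [2, 4, 6], "B": [2, 5, 8]})
col_mad = mean_abs_dev(df)          # one value per column
row_mad = mean_abs_dev(df, axis=1)  # one value per row
print(col_mad)
print(row_mad)
```

For this DataFrame the helper should reproduce the column-wise and row-wise results shown above.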
Specifying skipna
Consider the following DataFrame:
import numpy as np
df = pd.DataFrame({"A":[3,np.nan,5]})
df
     A
0  3.0
1  NaN
2  5.0
By default, skipna=True, which means that all missing values are ignored:
df.mad() # skipna=True
A    1.0
dtype: float64
To take missing values into account instead of skipping them:
df.mad(skipna=False)
A   NaN
dtype: float64
With skipna=False, if a row/column contains one or more missing values, then the MAD for that row/column will be NaN.
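To see why a single missing value is enough to produce NaN, the sketch below works through both cases by hand on the same values, using only Series.mean. It illustrates the arithmetic under the assumption that skipna applies to both the mean and the final average; it is not a look at how pandas implements mad internally.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, np.nan, 5])

# Skipping NaN: the mean is (3 + 5) / 2 = 4.0 and N = 2,
# so the result is (|3 - 4| + |5 - 4|) / 2
mad_skip = (s - s.mean()).abs().mean()

# Not skipping NaN: the mean is already NaN, so every deviation is NaN
mad_keep = (s - s.mean(skipna=False)).abs().mean(skipna=False)

print(mad_skip, mad_keep)
```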