Pandas DataFrame | agg method
Pandas DataFrame.agg(~) applies the specified function to each row or column of the DataFrame.
Parameters
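1. func | string or list or dict or function
The function to use for aggregating:
Type | Example |
---|---|
Function | lambda col: 2 * sum(col) |
Function name as a string | "mean" |
List of functions or function names | ["mean", "min"] |
Dict mapping column labels to functions or function names | {"A": "min"} |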
Built-in functions that can be used for aggregating are as follows:
Built-in aggregates | Description |
---|---|
sum | sum |
prod | product of values |
size | number of values |
count | number of non-NaN values |
mean | mean |
var | variance |
std | standard deviation |
sem | unbiased standard error of mean |
mad | mean absolute deviation (removed in pandas 2.0) |
min | minimum |
max | maximum |
median | median |
mode | mode |
quantile | quantile |
abs | absolute value |
skew | unbiased skewness |
kurt | unbiased kurtosis |
cumsum | cumulative sum |
cumprod | cumulative product |
cummax | cumulative max |
cummin | cumulative min |
2. axis
| int or string | optional
Whether to apply the function column-wise or row-wise:
Axis | Description |
---|---|
0 or "index" | The function is applied to each column. |
1 or "columns" | The function is applied to each row. |
By default, axis=0.
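As a minimal sketch of the two axis settings, the snippet below uses the same two-column DataFrame as the Examples section and assumes that the string labels "index" and "columns" behave the same as 0 and 1. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# axis=0 and axis="index" both aggregate down each column.
col_max = df.agg("max", axis=0)
col_max_named = df.agg("max", axis="index")

# axis=1 and axis="columns" both aggregate across each row.
row_max = df.agg("max", axis=1)
row_max_named = df.agg("max", axis="columns")

print(col_max.to_dict())  # keyed by column label
print(row_max.to_dict())  # keyed by row label
```

With axis=0 the result is indexed by column labels, whereas with axis=1 it is indexed by row labels.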
3. *args
Positional arguments to pass to func.
4. **kwargs
Keyword arguments to pass to func.
Return Value
A new scalar, Series or DataFrame, depending on the func that is passed.
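The following sketch illustrates how the return type can vary with func. It assumes a small two-column DataFrame, and the scalar case is shown through Series.agg(~) on a single column rather than through DataFrame.agg(~) itself.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

single = df.agg("sum")              # one reducing function: Series
multiple = df.agg(["sum", "mean"])  # list of functions: DataFrame
cumulative = df.agg("cumsum")       # non-reducing function: DataFrame shaped like df
scalar = df["A"].agg("sum")         # Series.agg on one column: scalar

for name, value in [("single", single), ("multiple", multiple),
                    ("cumulative", cumulative), ("scalar", scalar)]:
    print(name, type(value).__name__)
```

Note that cumulative functions such as "cumsum" do not reduce the data, so the result keeps the shape of the original DataFrame.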
Examples
Consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df
   A  B
0  1  3
1  2  4
Computing a single aggregate
To compute the mean of each column:
df.agg("mean") # Equivalent to df.agg(np.mean), with NumPy imported as np
A    1.5
B    3.5
dtype: float64
Here, the returned data type is Series.
Computing multiple aggregates
To compute the mean as well as the minimum of each column:
df.agg(["mean", "min"])
        A    B
mean  1.5  3.5
min   1.0  3.0
Here, the returned data type is DataFrame.
Computing aggregates for a subset of columns
To compute the minimum of just column A:
df.agg({"A":"min"}) # Returns a Series
A    1
dtype: int64
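A dict can also map each column to a list of functions. The sketch below is one way to request different aggregates per column; it assumes the same two-column DataFrame, and entries that were not requested for a column are expected to appear as NaN.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Column A: minimum only. Column B: minimum and maximum.
result = df.agg({"A": ["min"], "B": ["min", "max"]})

# The result is a DataFrame whose rows are the aggregate names.
print(result)
```

Because "max" was not requested for column A, that cell has no value to report and is filled with NaN.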
Computing aggregates row-wise
To compute the maximum of every row, set axis=1 like so:
df.agg("max", axis=1) # Returns a Series
0    3
1    4
dtype: int64
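Row-wise aggregation can be combined with a list of functions as well. As a small sketch on the same DataFrame, the call below computes both the minimum and the maximum of every row; the result is expected to have one row per original row and one column per aggregate.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# One output row per row of df, one output column per aggregate.
row_stats = df.agg(["min", "max"], axis=1)

print(row_stats)
```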
Defining custom aggregate functions
Consider the following DataFrame:
df
   A  B
0  1  3
1  2  4
We can pass a custom function that serves as the aggregate:
df.agg(lambda col: 2 * sum(col))
A     6
B    14
dtype: int64
Here, col is a Series that represents a column of df.
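A named function works in the same way as a lambda and can be mixed with built-in names in a list. The sketch below uses a hypothetical helper called value_range, which is not part of pandas, and assumes that pandas labels the result row with the function's name.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

def value_range(col):
    """Hypothetical helper: spread between the largest and smallest value."""
    return col.max() - col.min()

# Passed alone, the custom function yields one value per column (a Series).
ranges = df.agg(value_range)

# Mixed with a built-in name, the result is a DataFrame with one row per aggregate.
combined = df.agg(["min", value_range])

print(ranges)
print(combined)
```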
Passing in additional parameters
We can pass in additional parameters to func like so:
def foo(col, x):
    return col + x
df.agg(foo, x=5)
   A  B
0  6  8
1  7  9
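Note that foo above returns a Series of the same length as its input, so the result is a DataFrame rather than a reduced Series. For comparison, here is a minimal sketch of a reducing function that takes a keyword argument; scaled_range and its scale parameter are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

def scaled_range(col, scale=1):
    """Hypothetical helper: (max - min) of a column, multiplied by scale."""
    return scale * (col.max() - col.min())

# scale=10 is forwarded to scaled_range through **kwargs.
result = df.agg(scaled_range, scale=10)

print(result)
```

Since scaled_range returns a single value per column, the result here is a Series indexed by column label.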