Pandas DataFrame | apply method
Pandas DataFrame.apply(~) applies the specified function to each row or column of the DataFrame.
Parameters
1. func | function
The function to apply along the rows or columns.
2. axis | string or int | optional
The axis along which to apply the function:
Axis | Description |
---|---|
0 or "index" | Function will be applied to each column. |
1 or "columns" | Function will be applied to each row. |
By default, axis=0.
3. raw | boolean | optional
If True, then a NumPy array will be passed as the argument for func. If False, then a Series will be passed instead.
Performance-wise, if you're applying a reductive NumPy function such as np.sum, then opt for raw=True. By default, raw=False.
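4. result_type | string or None | optional
How to parse list-like return values of func. This is only relevant when axis=1 (when func is applied row-wise):
Value | Description |
---|---|
"expand" | Values of list-like results (e.g. [1,2,3]) will be separated out into columns, resulting in a DataFrame. |
"reduce" | Values of list-like results will be reduced to a single Series. |
"broadcast" | Values of list-like results will be separated out into columns, but unlike "expand", the column names will be retained. |
None | Behaviour depends on the value returned by your function. If a Series is returned, then its values are separated out into columns; if a list-like is returned, then it is placed in a Series. |
By default, result_type=None. Consult the examples below for clarification.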
5. args | tuple | optional
Additional positional arguments you want to supply to your func.
6. **kwds | optional
Additional keyword arguments you want to supply to your func.
Return Value
The resulting Series or DataFrame after applying your function.
Examples
Applying function on columns
Consider the following DataFrame:
df
   A  B
0  2  4
1  3  5
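The code that builds this DataFrame is not shown here. A minimal sketch that would produce the same values, assuming a default integer index, might look like this:

```python
import pandas as pd

# Assumed construction: the values are read off the printed DataFrame.
df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})
print(df)
```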
To apply the np.sum function column-wise:
df.apply(np.sum)
A    5
B    9
dtype: int64
Pandas can benefit from performance gains if you set raw=True when applying a NumPy reductive function like np.sum.
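To see what raw changes, the following sketch records the type of the object that the function receives. The helper names make_recorder, raw_types and series_types are hypothetical and exist only for this illustration; no timings are claimed.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

raw_types = []      # types seen when raw=True
series_types = []   # types seen when raw=False (the default)

def make_recorder(log):
    # Hypothetical helper: returns a summing function that logs its input type.
    def total(values):
        log.append(type(values).__name__)
        return values.sum()
    return total

raw_sums = df.apply(make_recorder(raw_types), raw=True)
series_sums = df.apply(make_recorder(series_types))

print(set(raw_types))     # the function received NumPy arrays
print(set(series_types))  # the function received Series
```

Both calls compute the same column sums; only the type of the argument passed to the function differs.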
Applying function on rows
Consider the same DataFrame as before:
df
   A  B
0  2  4
1  3  5
To apply the np.sum function row-wise, set axis=1:
df.apply(np.sum, axis=1)
0    6
1    8
dtype: int64
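The axis parameter also accepts strings. As a small sketch, the string aliases give the same results as their integer counterparts:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

# axis="columns" is the string form of axis=1 (one result per row).
row_sums_int = df.apply(np.sum, axis=1)
row_sums_str = df.apply(np.sum, axis="columns")

# axis="index" is the string form of axis=0 (one result per column).
col_sums_int = df.apply(np.sum, axis=0)
col_sums_str = df.apply(np.sum, axis="index")
```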
Applying built-in bundlers
Consider the same DataFrame as before:
df
   A  B
0  2  4
1  3  5
You could bundle values using built-in functions such as tuple, list and even Series:
df.apply(tuple)
A    (2, 3)
B    (4, 5)
dtype: object
Applying a custom function
Consider the same DataFrame as before:
df
   A  B
0  2  4
1  3  5
To apply a custom function:
def foo(col):
    return 2 * col
df.apply(foo)
   A   B
0  4   8
1  6  10
Our function foo takes in as argument a column (axis=0) of type Series, and returns the transformed column as a Series.
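A custom function can be applied row-wise in the same way. In the sketch below, the function receives each row as a Series indexed by column label and returns a single value per row; the name product is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

def product(row):
    # row is a Series whose index holds the column labels "A" and "B".
    return row["A"] * row["B"]

row_products = df.apply(product, axis=1)
print(row_products)
```

Because the function returns a scalar for each row, the result is a Series with one entry per row.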
Passing in keyword arguments
To pass in keyword arguments to func:
def foo(col, x):
    return x * col
df.apply(foo, x=2)
   A   B
0  4   8
1  6  10
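The same value can be supplied positionally through the args parameter instead. A minimal sketch, reusing the DataFrame values from the earlier examples:

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

def foo(col, x):
    return x * col

# args must be a tuple, hence the trailing comma for a single value.
doubled = df.apply(foo, args=(2,))
print(doubled)
```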
Different ways of parsing list-like return values
Consider the following DataFrame:
df
   A  B
a  4  6
b  5  7
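As before, the construction code is not shown. One way this DataFrame could be built, assuming the row labels a and b are set through the index argument:

```python
import pandas as pd

# Assumed construction: values and labels are read off the printed DataFrame.
df = pd.DataFrame({"A": [4, 5], "B": [6, 7]}, index=["a", "b"])
print(df)
```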
The parameter result_type comes into play when the return type of the function is list-like.
Result type None
When result_type is not set, then the default behaviour is to place list-like return values in a Series:
df.apply(lambda x: [1,2,3], axis=1) # Returns a Series
a    [1, 2, 3]
b    [1, 2, 3]
dtype: object
Note that lambda x: [1,2,3] is equivalent to the following:
def foo(x): # The function name isn't important here
    return [1,2,3]
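With the default result_type=None, the outcome depends on what the function returns. The sketch below contrasts a list with a Series, and also shows result_type="reduce", which keeps list-like results in a single Series. The variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5], "B": [6, 7]}, index=["a", "b"])

# A list is kept as-is: the result is a Series of lists.
as_list = df.apply(lambda row: [1, 2, 3], axis=1)

# "reduce" also yields a single Series for list-like results.
as_reduce = df.apply(lambda row: [1, 2, 3], axis=1, result_type="reduce")

# A returned Series is separated out into columns: the result is a DataFrame.
as_series = df.apply(lambda row: pd.Series([1, 2, 3]), axis=1)

print(type(as_list).__name__, type(as_reduce).__name__, type(as_series).__name__)
```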
Result type expand
When result_type="expand", the values of a list-like will be separated out into columns, resulting in a DataFrame:
df.apply(lambda x: [1,2,3], axis=1, result_type="expand") # Returns a DataFrame
   0  1  2
a  1  2  3
b  1  2  3
Notice how we no longer have our original column names A and B.
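If meaningful column names are needed after expanding, they can be assigned afterwards. A minimal sketch, where the labels x, y and z are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5], "B": [6, 7]}, index=["a", "b"])

expanded = df.apply(lambda row: [1, 2, 3], axis=1, result_type="expand")

# The expanded columns are labelled 0, 1, 2; replace them with chosen names.
expanded.columns = ["x", "y", "z"]
print(expanded)
```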
Result type broadcast
When result_type="broadcast", the list-like values will be separated out into columns, but unlike "expand", the column names will be retained:
df.apply(lambda x: [1,2], axis=1, result_type="broadcast") # Returns a DataFrame
   A  B
a  1  2
b  1  2
For this to work, the length of the list-like must be equal to the number of columns in the source DataFrame. This means that returning [1,2,3] instead of [1,2] in this case would result in an error.
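The length requirement can be checked with a short sketch. It assumes that the mismatch is reported as a ValueError, which is how recent pandas versions appear to behave; the exact message may differ between versions.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5], "B": [6, 7]}, index=["a", "b"])

# Two values per row match the two columns, so broadcasting succeeds.
ok = df.apply(lambda row: [1, 2], axis=1, result_type="broadcast")

# Three values per row do not fit into two columns.
raised = False
try:
    df.apply(lambda row: [1, 2, 3], axis=1, result_type="broadcast")
except ValueError as err:  # assumed exception type
    raised = True
    print("broadcast failed:", err)
```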