Pandas DataFrame | where method
Pandas DataFrame.where(~)
uses a boolean mask to selectively replace values in the source DataFrame.
Parameters
1. cond
| boolean
or array-like
or callable
| required
A boolean mask, which is an array-like structure (e.g. Series or DataFrame) that contains either True or False as its entries.
If an entry is True, then the corresponding value in the source DataFrame is left as is. If an entry is False, then the corresponding value in the source DataFrame is replaced by the one in other.
If a callable is passed, then the function takes a DataFrame as its argument and returns a DataFrame of booleans. This callable must not modify the source DataFrame.
2. other
| scalar
or Series
or DataFrame
or function
| optional
The values that replace the entries that are False in cond. By default, those entries are replaced with NaN.
If a callable is passed, then the function takes the source DataFrame as its argument and returns a scalar, Series or DataFrame that supplies the replacement values. Once again, this callable must not modify the source DataFrame.
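As a minimal sketch of what happens when other is left out, the snippet below reuses the small DataFrame from the examples on this page (the variable names mask and result are only illustrative). Entries where the mask is False become NaN, which also turns the integer columns into floats:

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
mask = pd.DataFrame({"A": [True, False], "B": [False, True]})

# No `other` is given, so entries where the mask is False become NaN
result = df.where(mask)
print(result)
print(result.dtypes)  # integer columns are upcast so that they can hold NaN
```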
3. inplace
| boolean
| optional
Whether or not to perform the method inplace. An inplace method directly modifies the source DataFrame instead of creating and returning a new DataFrame. By default, inplace=False.
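The following sketch illustrates the difference, assuming the same small DataFrame used in the examples on this page (returned and mask are illustrative names). With inplace=True the call returns None and df itself is changed:

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
mask = pd.DataFrame({"A": [True, False], "B": [False, True]})

# With inplace=True nothing is returned; df itself is modified
returned = df.where(mask, 10, inplace=True)
print(returned)  # None
print(df)
```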
4. axis
| int
| optional
The axis along which to align other with the source DataFrame, if alignment is needed (e.g. when other is a Series). By default, axis=None.
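A short sketch of where axis matters: when other is a Series, axis tells Pandas whether its labels refer to rows or columns. The Series row_fill below is a made-up example, not something taken from this page:

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
mask = pd.DataFrame({"A": [True, False], "B": [False, True]})

# One replacement value per row label (hypothetical values for illustration)
row_fill = pd.Series([100, 200], index=[0, 1])

# axis=0 aligns the Series with the row index, so each row gets its own replacement
result = df.where(mask, row_fill, axis=0)
print(result)
```

Here the masked-out entry in row 0 is filled with 100 and the one in row 1 with 200.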
5. level
| int
| optional
The level on which to perform the alignment. This is only relevant if your source DataFrame has a MultiIndex.
6. errors
| string
| optional
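Whether to raise or suppress errors:
Value | Description |
---|---|
"raise" | Allow for errors to be raised. |
"ignore" | When an error occurs, return the source DataFrame. |
By default, errors="raise". Note that this parameter was deprecated and is no longer accepted as of Pandas 2.0.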
7. try_cast
| boolean
| optional
Whether or not to try casting the resulting DataFrame back to the source DataFrame's type. By default, try_cast=False. Note that this parameter was deprecated and is no longer accepted as of Pandas 2.0.
Return Value
A DataFrame
.
Examples
Basic usage
Consider the following DataFrame:
import pandas as pd

df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
   A  B
0  3  5
1  4  6
Suppose we have the following DataFrame that acts as the boolean mask:
df_mask = pd.DataFrame({"A":[True,False],"B":[False,True]})
df_mask
       A      B
0   True  False
1  False   True
We then call where(~) to replace the values in df where the corresponding entry in df_mask is False:
df.where(df_mask, 10)
    A   B
0   3  10
1  10   6
Passing in a callable for cond
Consider the same DataFrame as before:
df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
   A  B
0  3  5
1  4  6
Instead of specifying an array-like mask as the first parameter, we can also pass in a function like so:
def foo(my_df):
    return my_df > 4

df.where(foo, 10)
    A  B
0  10  5
1  10  6
Here, the function foo
takes in as argument the entire DataFrame, and returns a DataFrame of booleans. Again, a boolean of True
would mean that the corresponding values will be kept intact, while replacement is carried out for False
.
Note that the previous code snippet can be written compactly using lambdas:
df.where(lambda x: x > 4, 10)
    A  B
0  10  5
1  10  6
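As a quick check, here is a sketch showing that the callable form gives the same result as building the boolean DataFrame yourself with df > 4 (the names via_callable and via_mask are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})

# Callable form: Pandas calls the function with df to obtain the mask
via_callable = df.where(lambda x: x > 4, 10)

# Explicit form: build the boolean mask first, then pass it in
via_mask = df.where(df > 4, 10)

print(via_callable.equals(via_mask))
```

The callable form is mostly convenient in method chains, where the intermediate DataFrame has no name to refer to.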
Passing in callable for other
Consider the same DataFrame as before:
df = pd.DataFrame({"A":[3,4],"B":[5,6]})
df
   A  B
0  3  5
1  4  6
Suppose we have the following mask:
my_mask = [[True,False],[True,False]]
my_mask
[[True, False], [True, False]]
Let us pass a callable for the other
parameter:
df.where(my_mask, lambda x: x + 10)
   A   B
0  3  15
1  4  16
The callable takes the source DataFrame as its argument and returns the replacement values. Only the entries where the mask is False (column B here) are taken from its result.
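To see what the callable for other actually receives, the sketch below records the type of its argument. The function add_ten and the list seen are hypothetical helpers used only for this illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4], "B": [5, 6]})
my_mask = [[True, False], [True, False]]

seen = []  # records the type of whatever Pandas passes to the callable

def add_ten(frame):
    seen.append(type(frame).__name__)
    return frame + 10  # a new DataFrame; the argument itself is not modified

result = df.where(my_mask, add_ten)
print(seen)
print(result)
```

The recorded type shows that the callable is handed the DataFrame as a whole rather than one value at a time.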