
Pandas DataFrame | mask method

Last updated: Aug 11, 2023
Tags: Python, Pandas

Pandas DataFrame.mask(~) replaces all values in the DataFrame that satisfy a given condition with the desired value.

Parameters

1. cond | array-like of booleans

A boolean mask, which is an array-like structure (e.g. a Series or DataFrame) whose entries are either True or False.
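Besides a precomputed boolean structure, the pandas documentation also lists a callable as a valid cond. A minimal sketch, assuming the same two-column df used in the examples on this page (the lambda and the name result are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# The callable receives the DataFrame and must return a boolean structure
result = df.mask(lambda frame: frame > 2, 5)
print(result)
```

This is handy in method chains, where the intermediate DataFrame has no name to build a mask from.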

2. other | number or string or Series or DataFrame

The value(s) used to replace the entries where cond is True.
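If other is left out, the masked entries end up as missing values. A short sketch of that behavior, assuming a small integer DataFrame (exact dtypes of the result may differ between pandas versions, so none are shown):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# No replacement value supplied: entries where cond is True become NaN
result = df.mask(df > 2)
print(result)
```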

3. inplace | boolean | optional

  • If True, then the method will directly modify the source DataFrame instead of creating a new DataFrame.

  • If False, then a new DataFrame will be created and returned.

By default, inplace=False.
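The difference between the two settings can be sketched as follows, assuming a small example DataFrame (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Default (inplace=False): a new DataFrame is returned and df is left as-is
new_df = df.mask(df > 2, 5)
snapshot = df.copy()  # df still holds the original values at this point

# inplace=True: df itself is modified and the call returns None
returned = df.mask(df > 2, 5, inplace=True)
print(returned)
print(df)
```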

4. axis | int | optional

The axis along which to align other with the source DataFrame, if alignment is needed (for instance, when other is a Series). By default, axis=None.
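The axis parameter matters mostly when other is a Series. A hedged sketch, assuming a Series whose index matches the row labels of df, so that axis=0 supplies one replacement value per row:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# One replacement value per row label (0 and 1)
row_values = pd.Series([10, 20])

# axis=0 aligns the Series with the row index of df
result = df.mask(df > 2, row_values, axis=0)
print(result)
```

Here the masked entry in row 0 takes the value 10 and the masked entry in row 1 takes the value 20.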

5. level | int | optional

The level to align on, if alignment is needed. This is only relevant if your source DataFrame has a MultiIndex.

6. errors | string | optional

Whether to raise or suppress errors:

Value | Description
"raise" | Allow errors to be raised.
"ignore" | When an error occurs, return the source DataFrame.

By default, errors="raise". Note that this parameter was deprecated in pandas 1.5 and removed in pandas 2.0.

7. try_cast | boolean | optional

Whether or not to cast the resulting DataFrame into the source DataFrame's type. By default, try_cast=False. Note that this parameter was deprecated in pandas 1.3 and removed in pandas 2.0.

Return Value

A DataFrame with values replaced according to your parameters. Note that the shape is the same as that of the source DataFrame.

Examples

Applying custom masks

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df
   A  B
0  1  3
1  2  4

Our goal is to replace all values greater than 2 with the value 5 using the mask(~) method.

In order to use mask(~), we first need to prepare the mask like so:

df_mask = df > 2
df_mask
       A     B
0  False  True
1  False  True

Notice how all the values that fit our condition (value > 2) are flagged as True, and those that aren't as False.

Finally, we apply the mask like so:

df.mask(df_mask, 5)
   A  B
0  1  5
1  2  5

We see that all values greater than 2 (values 3 and 4 in this case) have been replaced by 5.
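For comparison, mask(~) is the counterpart of DataFrame.where(~), which keeps values where the condition is True and replaces the rest. A small sketch of that relationship, reusing the DataFrame from this example:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df_mask = df > 2

# mask replaces where the condition is True...
masked = df.mask(df_mask, 5)

# ...while where replaces where the condition is False, so we negate it
via_where = df.where(~df_mask, 5)

print(masked.equals(via_where))
```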

Applying Pandas built-in masks

Consider the following DataFrame:

import numpy as np

df = pd.DataFrame({"A": [np.nan,2], "B":[3,np.nan]})
df
     A    B
0  NaN  3.0
1  2.0  NaN

Our df contains two missing values. Our goal is to replace these missing values with a value of 5.

Instead of creating our own boolean mask like we did before, we can leverage the Pandas DataFrame.isna(~) method:

df.isna()
       A      B
0   True  False
1  False   True

We can perform the masking operation directly like so:

df.mask(df.isna(), 5)
     A    B
0  5.0  3.0
1  2.0  5.0

Note that this is just an example to illustrate the use of mask(~). To fill missing values in practice, use fillna(~) instead.
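As a quick check of that recommendation, the sketch below compares the two approaches on the same DataFrame (np.nan is used here to create the missing values):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 2], "B": [3, np.nan]})

# Masking the missing entries by hand...
masked = df.mask(df.isna(), 5)

# ...gives the same result as the dedicated fillna method
filled = df.fillna(5)

print(masked.equals(filled))
```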

Using a DataFrame as the replacer

In the previous two examples, we simply replaced all values satisfying a condition with a single number. The mask(~) method can also take a DataFrame as the replacer, which is useful when each masked entry needs its own replacement value.

As an example, consider the following DataFrame:

df = pd.DataFrame({"A":[1,2],"B":[3,4]})
df
   A  B
0  1  3
1  2  4

Once again, let's say we want to modify all values that are greater than 2.

We prepare the mask like so:

df_mask = df > 2
df_mask
       A     B
0  False  True
1  False  True

Next, we create the DataFrame to use as our replacer:

df_replacer = pd.DataFrame({"A":[5,6], "B":[7,8]})
df_replacer
   A  B
0  5  7
1  6  8

Finally, use the mask(~) method to apply our mask:

df.mask(df_mask, df_replacer)
   A  B
0  1  7
1  2  8

Notice how values in df that were flagged as True in df_mask were replaced by the corresponding entry in df_replacer.
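The replacer does not have to be built by hand; it can also be derived from the source DataFrame. A minimal sketch, assuming we want every value greater than 2 multiplied by 10 (the factor is arbitrary and chosen only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# df * 10 has the same shape and labels as df, so each masked entry
# is replaced by the entry at the same position in the replacer
result = df.mask(df > 2, df * 10)
print(result)
```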

Published by Isshin Inada