
Pandas Series str | extractall method

Last updated: Aug 12, 2023
Tags: Python, Pandas

Pandas Series' str.extractall(~) extracts every substring that matches the capturing groups of a regular expression.

NOTE

To extract the first match instead of all matches, use str.extract(~).
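The difference between the two methods can be sketched as follows. This is a minimal illustration on a made-up Series (the variable names `s`, `first` and `every` are assumptions for the example, not part of the API):

```python
import pandas as pd

s = pd.Series(['k23', '45k', '67k89'], index=['a', 'b', 'c'])

# str.extract keeps only the first match per string: one row per input row
first = s.str.extract(r'(\d+)')

# str.extractall keeps every match: one row per match,
# indexed by (original label, match number)
every = s.str.extractall(r'(\d+)')

print(first)
print(every)
```

Only the string at label c contains two runs of digits, so it is the only one that contributes a second row to the str.extractall(~) result.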

Parameters

1. pat | str

Regular expression to match. It must contain at least one capturing group.

2. flags | int | optional

The flags to set from the re library (e.g. re.IGNORECASE). Multiple flags can be combined with the bitwise OR operator | (e.g. re.IGNORECASE | re.MULTILINE). Defaults to 0 (no flags).
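As a rough sketch of how the flags parameter changes the result, the following compares a case-sensitive and a case-insensitive search. The sample Series and the names `strict` and `loose` are assumptions made up for this illustration:

```python
import re
import pandas as pd

s = pd.Series(['Ab1', 'aB2 AB3', 'cd4'])

# Without flags, 'ab' only matches lowercase 'ab', which never occurs here,
# so the result is an empty DataFrame
strict = s.str.extractall(r'(ab)(\d)')

# With re.IGNORECASE, 'Ab', 'aB' and 'AB' all match
loose = s.str.extractall(r'(ab)(\d)', flags=re.IGNORECASE)

print(strict)
print(loose)
```

The extracted text keeps its original casing; the flag only affects what counts as a match.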

Return Value

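A multi-index DataFrame with one row per match and one column per capturing group. The outer index level comes from the index of the Series, and the innermost level, named match, numbers the matches within each string starting from 0.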

Examples

Basic usage

Consider the following DataFrame:

import pandas as pd
df = pd.DataFrame({'A':['k23','45k','67k89']}, index=['a','b','c'])
df
       A
a    k23
b    45k
c  67k89

To extract all substrings that match a given regex:

df['A'].str.extractall(r'(\d+)') # returns a multi-index DataFrame
          0
  match
a 0      23
b 0      45
c 0      67
  1      89

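Here, the input string is a regex in which \d+ matches one or more consecutive digits, while the parentheses () mark the capturing group, that is, the portion we want to extract. The pattern is written as a raw string (the r prefix) so that Python does not treat \d as an escape sequence.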

Since the resulting DataFrame has a multi-index, we can obtain the matches for a specific index label like so:

df_result = df['A'].str.extractall(r'(\d+)')
df_result.loc['c']
        0
match
0      67
1      89
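Beyond .loc, the multi-index can be sliced or reshaped with the usual Pandas tools. The following is a sketch of a few options rather than a prescribed workflow; the names `first_matches`, `wide` and `counts` are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': ['k23', '45k', '67k89']}, index=['a', 'b', 'c'])
df_result = df['A'].str.extractall(r'(\d+)')

# First match of every row: take a cross-section on the 'match' level
first_matches = df_result.xs(0, level='match')

# One row per original label, one column per match number (NaN where absent)
wide = df_result[0].unstack('match')

# Number of matches found for each original label
counts = df_result.groupby(level=0).size()

print(first_matches)
print(wide)
print(counts)
```

Labels whose string produced no match at all do not appear in any of these results, since str.extractall(~) only emits rows for actual matches.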

Multiple capturing groups

Consider the following DataFrame:

import pandas as pd
df = pd.DataFrame({'A':['k23','45y','67k89']}, index=['a','b','c'])
df
       A
a    k23
b    45y
c  67k89

We can capture multiple groups by using multiple pairs of parentheses. Each group becomes its own column in the result:

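df['A'].str.extractall(r'(\d+)([ky])')
          0  1
  match
b 0      45  y
c 0      67  k

Note that row a is absent from the result because k23 contains no digits immediately followed by k or y. For the same reason, the trailing 89 in row c is not extracted.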
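With several groups, the default column labels 0 and 1 can be hard to read. One option, sketched here under the assumption that descriptive labels are wanted, is to use Python's named-group syntax (?P<name>...), in which case the group names become the column labels. The names `number` and `letter` are made up for this example:

```python
import pandas as pd

s = pd.Series(['k23', '45y', '67k89'], index=['a', 'b', 'c'])

# (?P<name>...) names a capturing group; the names are used as column labels
named = s.str.extractall(r'(?P<number>\d+)(?P<letter>[ky])')

print(named)
print(named['number'].tolist())
```

The rows are the same as in the unnamed version; only the column labels change.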
Published by Isshin Inada