Pandas Series str | split method
Pandas Series.str.split(~) method splits each string in the Series.
Parameters
1. pat | string | optional
The string or regular expression pattern to split the strings on. By default, pat=None, which splits on whitespace.
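As a quick illustration of the default separator, here is a minimal sketch; the sample strings are made up for this example. When pat is omitted, a run of consecutive whitespace characters is treated as one separator:

```python
import pandas as pd

# Hypothetical sample data with irregular spacing
s = pd.Series(["a b", "c  d e"])

# No pat given, so each string is split on whitespace
result = s.str.split()
print(result.tolist())  # [['a', 'b'], ['c', 'd', 'e']]
```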
2. n | int | optional
The number of splits to allow for each value. By default, there is no limit. Note that the values None, 0 and -1 are all interpreted as no limit.
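The equivalence of the "no limit" values can be checked with a small sketch; the sample string is an assumption chosen for illustration, and only 0 and -1 are compared against the default here:

```python
import pandas as pd

# Hypothetical sample value containing two separators
s = pd.Series(["a_2_3"])

unlimited = s.str.split("_").tolist()            # default: no limit
with_zero = s.str.split("_", n=0).tolist()       # 0 also means no limit
with_minus_one = s.str.split("_", n=-1).tolist() # -1 also means no limit

print(unlimited)  # [['a', '2', '3']]
print(unlimited == with_zero == with_minus_one)  # True
```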
3. expand | boolean | optional
If True, then the returned list is horizontally expanded into separate columns. If False, then a list is returned for each value. By default, expand=False.
Return Value
If expand=True, then a DataFrame/MultiIndex is returned. Otherwise, a Series/Index is returned.
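The four possible return types can be seen in a short sketch, assuming made-up sample values. Calling the method on a Series gives a Series or DataFrame, while calling it on an Index gives an Index or MultiIndex:

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series(["a_1", "b_2"])
idx = pd.Index(["a_1", "b_2"])

s_lists = s.str.split("_")                     # Series of lists
s_expanded = s.str.split("_", expand=True)     # DataFrame
idx_lists = idx.str.split("_")                 # Index of lists
idx_expanded = idx.str.split("_", expand=True) # MultiIndex

for obj in (s_lists, s_expanded, idx_lists, idx_expanded):
    print(type(obj).__name__)
```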
Examples
Basic usage
Consider the following Series:
s
0      a
1    a_1
2    a_2
dtype: object
To split each string by _:
s.str.split("_")
0       [a]
1    [a, 1]
2    [a, 2]
dtype: object
Notice how each value in the Series is now a list.
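Because each value is a list, individual pieces can be pulled out with the str accessor. The following is a minimal sketch that rebuilds the Series from the values shown above; the variable names are illustrative only:

```python
import pandas as pd

s = pd.Series(["a", "a_1", "a_2"])
parts = s.str.split("_")

# The str accessor also indexes into lists
first = parts.str[0]       # first piece of every value
second = parts.str.get(1)  # second piece, NaN where the list is too short

print(first.tolist())  # ['a', 'a', 'a']
print(second)
```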
Using regex
A regular expression can be used directly as the separator:
s.str.split(r'[_*]')
0    [a, 1]
1    [a, 2]
dtype: object
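The input Series for this example is not shown, so the sketch below assumes one that mixes the two separators matched by the character class [_*]. It is one possible input that yields the output above, not necessarily the original one:

```python
import pandas as pd

# Assumed input: one value uses "_" and the other uses "*"
s = pd.Series(["a_1", "a*2"])

# The character class [_*] matches either an underscore or an asterisk
result = s.str.split(r"[_*]")
print(result.tolist())  # [['a', '1'], ['a', '2']]
```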
Specifying n
By default, there is no limit as to how many splits can be made:
s.str.split("_")
0       [a, 1]
1    [a, 2, 3]
dtype: object
To allow at most 1 split to take place for each value:
s.str.split("_", n=1)
0      [a, 1]
1    [a, 2_3]
dtype: object
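For a runnable version of this comparison, here is a sketch that assumes the Series holds the values a_1 and a_2_3, which is what the outputs above imply:

```python
import pandas as pd

# Assumed input, inferred from the outputs shown above
s = pd.Series(["a_1", "a_2_3"])

no_limit = s.str.split("_")        # split at every underscore
one_split = s.str.split("_", n=1)  # split at the first underscore only

print(no_limit.tolist())   # [['a', '1'], ['a', '2', '3']]
print(one_split.tolist())  # [['a', '1'], ['a', '2_3']]
```

With n=1, everything after the first separator is kept together as the final piece.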
Specifying expand
By default, expand=False, which means that each value becomes a list:
s.str.split("_")
0       [a]
1    [a, 1]
2    [a, 2]
dtype: object
You can expand the list by setting expand=True like so:
s.str.split("_", expand=True) # returns a DataFrame
   0     1
0  a  None
1  a     1
2  a     2
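The expanded result is an ordinary DataFrame, so its integer column labels can be replaced with descriptive ones. In this sketch the names prefix and suffix are hypothetical, chosen only for illustration:

```python
import pandas as pd

s = pd.Series(["a", "a_1", "a_2"])

df = s.str.split("_", expand=True)
df.columns = ["prefix", "suffix"]  # hypothetical column names

print(df.shape)               # (3, 2)
print(df["prefix"].tolist())  # ['a', 'a', 'a']
# Rows with fewer pieces are padded with a missing value
print(df["suffix"].isna().tolist())  # [True, False, False]
```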
Handling missing values
The result of a split for an individual missing value (NaN) is also NaN:
s.str.split("_")
0    [a, 1]
1       NaN
dtype: object
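A minimal sketch of this behavior, assuming a Series with one string and one missing value as the output above suggests:

```python
import numpy as np
import pandas as pd

# Assumed input, inferred from the output shown above
s = pd.Series(["a_1", np.nan])

result = s.str.split("_")

# The missing value is passed through rather than turned into a list
print(result.isna().tolist())  # [False, True]
```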