PySpark SQL Functions | instr method
PySpark SQL Functions' instr(~) method returns a new PySpark Column holding the position of the first occurrence of the specified substring in each value of the specified column.
Positions are 1-based: counting starts from 1 instead of 0.
Parameters
1. str | string or Column
The column to perform the operation on.
2. substr | string
The substring whose position is to be found.
Return Value
A PySpark Column holding integer positions.
Examples
Consider the following PySpark DataFrame:
+----+
|   x|
+----+
| ABA|
| BBB|
| CCC|
|null|
+----+
Getting the position of the first occurrence of a substring in a PySpark Column
To get the position of the first occurrence of the substring "B" in column x, use the instr(~) method.
Here, note the following:
- 2 is returned for the column value "ABA" because the substring "B" first occurs at the 2nd position. Remember, this method counts positions from 1 instead of 0.
- If the substring does not exist in the string, then a value of 0 is returned. This is the case for "CCC" because this string does not include "B".
- If the string is null, then the result will also be null.
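The three cases above can also be modelled in plain Python, which may help when reasoning about results without a Spark session. The helper instr_model is hypothetical and not part of PySpark. It mirrors only the behaviour listed above: 1-based position, 0 when the substring is absent, and null propagation.

```python
from typing import Optional


def instr_model(value: Optional[str], substr: str) -> Optional[int]:
    """Hypothetical plain-Python model of instr's per-value behaviour."""
    if value is None:
        # A null input yields a null result.
        return None
    # str.find returns a 0-based index, or -1 when the substring is absent,
    # so adding 1 gives a 1-based position, or 0 for "not found".
    return value.find(substr) + 1


values = ["ABA", "BBB", "CCC", None]
positions = [instr_model(v, "B") for v in values]
print(positions)  # [2, 1, 0, None]
```

Because a missing substring maps to 0 rather than null, a result of 0 can be used directly as a "does not contain" check.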