PySpark SQL Functions | instr method
PySpark SQL Functions' instr(~) method returns a new PySpark Column holding the position of the first occurrence of the specified substring in each value of the specified column.
The position is 1-based rather than 0-based: counting starts from 1, and a value of 0 means the substring was not found.
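As a rough plain-Python analogy of this counting convention (not how Spark computes it internally), Python's str.find is 0-based and returns -1 when the substring is absent, so adding 1 yields the same numbers for non-empty substrings. The helper name position_1_based is hypothetical and used only for illustration.

```python
def position_1_based(text, substr):
    """Illustrative stand-in that mimics the numbers instr reports."""
    if text is None:
        return None  # a null input gives a null result
    # str.find is 0-based and returns -1 when substr is absent,
    # so adding 1 gives a 1-based position, or 0 for "not found"
    return text.find(substr) + 1

print(position_1_based("ABA", "B"))  # 2
print(position_1_based("CCC", "B"))  # 0
```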
Parameters
1. str | string or Column
The column to perform the operation on.
2. substr | string
The substring whose position to find.
Return Value
A PySpark Column.
Examples
Consider the following PySpark DataFrame:
+----+
|   x|
+----+
| ABA|
| BBB|
| CCC|
|null|
+----+
Getting the position of the first occurrence of a substring in a PySpark Column
To get the position of the first occurrence of the substring "B" in column x, use the instr(~) method:
Here, note the following:
- 2 is returned for the column value "ABA" because the substring "B" first occurs at the 2nd position. Remember, this method counts positions from 1 instead of 0.
- if the substring does not exist in the string, then a value of 0 is returned. This is the case for "CCC" because this string does not include "B".
- if the string is null, then the result will also be null.