PySpark Column | startswith method
Last updated: Aug 12, 2023
PySpark Column's startswith(~) method returns a Column of booleans where True is given to strings that begin with the specified substring.
Parameters
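1. other | string or Column
The substring, or a Column of substrings, to look for at the start of each value. The match is literal, so regular expression syntax such as ^ is not interpreted.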
Return Value
A Column object holding booleans.
Examples
Consider the following PySpark DataFrame:
+-----+---+
| name|age|
+-----+---+
| Alex| 20|
|  Bob| 30|
|Cathy| 40|
+-----+---+
Getting rows that start with a certain substring in PySpark DataFrame
To get the rows whose name value starts with a certain substring:
Here, F.col("name").startswith("A") returns a Column object of booleans where True corresponds to values that begin with A:
+-------------------+
|startswith(name, A)|
+-------------------+
|               true|
|              false|
|              false|
+-------------------+
We then use the PySpark DataFrame's filter(~) method to fetch rows that correspond to True.
Published by Isshin Inada
Official PySpark Documentation
https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.Column.startswith.html