PySpark SQL Functions | to_date method
PySpark SQL Functions' to_date(~) method converts date strings to date type.
Parameters
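1. col | Column or string
The column of date strings to convert.
2. format | string | optional
The pattern of the date strings, for example "yyyy-MM-dd". If omitted, Spark's default rules for casting a string to a date are applied.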
Return Value
A PySpark Column.
Examples
Consider the following PySpark DataFrame with some date strings:
+----+----------+
|name|  birthday|
+----+----------+
|Alex|1995-12-16|
| Bob|1998-05-06|
+----+----------+
Converting date strings to date type in PySpark
To convert date strings in the birthday column to actual date type, use to_date(~) and specify the pattern of the date string:
from pyspark.sql import functions as F
df.withColumn("birthday", F.to_date(df["birthday"], "yyyy-MM-dd")).printSchema()
root
 |-- name: string (nullable = true)
 |-- birthday: date (nullable = true)
Here, the withColumn(~) method is used to update the birthday column using the new column returned by to_date(~).
As another example, here's a PySpark DataFrame with slightly more complicated date strings:
df = spark.createDataFrame([["Alex", "1995/12/16 16:20:20"], ["Bob", "1998/05/06 18:56:10"]], ["name", "birthday"])
+----+-------------------+
|name|           birthday|
+----+-------------------+
|Alex|1995/12/16 16:20:20|
| Bob|1998/05/06 18:56:10|
+----+-------------------+
Here, our date strings also contain hours, minutes and seconds.
Converting the birthday column to date type, with a pattern that matches these strings such as "yyyy/MM/dd HH:mm:ss", gives:
+----+----------+
|name|  birthday|
+----+----------+
|Alex|1995-12-16|
| Bob|1998-05-06|
+----+----------+
Notice that the hours, minutes and seconds have been lost during the type conversion.