PySpark SQL Functions | date_add method
Last updated: Aug 12, 2023
Tags: PySpark
PySpark's date_add() method adds the specified number of days to a date column.
Parameters
1. start
The column of starting dates.
2. days | int
The number of days to add.
Return Value
A pyspark.sql.column.Column object.
Examples
Basic usage
Consider the following DataFrame:
+----------+
|   my_date|
+----------+
|2023-04-20|
|2023-04-22|
+----------+
To add 5 days to our column:
Adding a column of days to a column of dates
In the PySpark version referenced here (3.1.1), the date_add() method accepts only an integer constant as its second parameter. To add a column of days to a column of dates, we must take another approach.
To demonstrate, consider the following PySpark DataFrame:
+----------+-------+
|   my_date|my_days|
+----------+-------+
|2023-04-20|      3|
|2023-04-22|      5|
+----------+-------+
To add my_days to my_date, call the SQL date_add function inside the F.expr() method. Cast my_days to INT first: by default, integers have type BIGINT, which causes F.expr() to raise an error.
+----------+-------+----------+
|   my_date|my_days|  new_date|
+----------+-------+----------+
|2023-04-20|      3|2023-04-23|
|2023-04-22|      5|2023-04-27|
+----------+-------+----------+
The resulting data type of the columns is as follows:
root
 |-- my_date: string (nullable = true)
 |-- my_days: integer (nullable = true)
 |-- new_date: date (nullable = true)
Notice that even though my_date is of type string, the resulting new_date is of type date.
Published by Isshin Inada
Official PySpark Documentation
https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.functions.date_add.html