Pandas | DatetimeIndex constructor
The Pandas DatetimeIndex constructor creates a new DatetimeIndex object, which is most often used as the index of a DataFrame.
Prefer the pd.date_range(~) method over this constructor when initialising a DatetimeIndex. date_range(~) offers more flexibility and is used more widely in the official documentation.
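As a rough illustration of the recommended alternative, the sketch below builds a daily DatetimeIndex with pd.date_range(~). The start date and the number of periods are arbitrary values chosen for this example.

```python
import pandas as pd

# Build a DatetimeIndex of three consecutive days, starting on 2020-12-25.
# The start date and periods are illustrative values only.
idx = pd.date_range(start="2020-12-25", periods=3, freq="D")
print(idx)
```

Unlike the constructor, date_range(~) generates the dates from a start point and a frequency, so the individual dates do not have to be listed by hand.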
Parameters
1. data | 1D array-like | optional
An array-like to construct the DatetimeIndex from.
2. freq | string | optional
The frequency of the DatetimeIndex. By default, freq=None.
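A minimal sketch of setting freq, assuming the supplied dates are evenly spaced one day apart so that they match the daily frequency "D":

```python
import pandas as pd

# Three consecutive days, so a daily frequency ("D") fits the data.
dates = ["2020-12-25", "2020-12-26", "2020-12-27"]
idx = pd.DatetimeIndex(dates, freq="D")
print(idx.freq)
```

If the dates do not conform to the requested frequency (for example, if one day is skipped), the constructor raises a ValueError instead of silently adjusting them.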
3. tz | pytz.timezone or dateutil.tz.tzfile or datetime.tzinfo or string | optional
The timezone of the DatetimeIndex. By default, the resulting DatetimeIndex is naive, that is, it carries no timezone information.
4. normalize | boolean | optional
Whether to reset the time component of each date to midnight. By default, normalize=False.
Setting this parameter does not seem to have any effect. If you encounter the same issue, use the normalize parameter of date_range(~) instead.
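The date_range(~) workaround mentioned above might look like the following sketch. The start timestamp is an arbitrary value with a non-midnight time, chosen only to make the effect visible.

```python
import pandas as pd

# The start has a time of 15:30, but normalize=True resets it to midnight
# before the range is generated.
idx = pd.date_range(start="2020-12-25 15:30", periods=2, freq="D", normalize=True)
print(idx)
```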
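5. closed | None or string | optional
Whether or not to make the bounds inclusive/exclusive:
Value | Description |
---|---|
"left" | The left bound is inclusive and the right bound is exclusive. |
"right" | The left bound is exclusive and the right bound is inclusive. |
None | Both bounds become inclusive. |
By default, closed=None.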
6. ambiguous | string or array<boolean> | optional
This parameter is only relevant if you specified the timezone (tz). Due to time adjustments caused by Daylight Saving Time (DST), ambiguity in the time can arise. For instance, consider the following case:
Local time:
01:59:58
01:59:59
01:00:00   # DST ends and so we set the wall clock back 1 hour
01:00:01
...
01:59:58   # This local time occurred for the second time
...
If you try to localize a time that occurred twice (e.g. 01:59:58), then Pandas cannot tell which occurrence you mean: the first one (DST) or the second one (non-DST).
Pandas can deal with such ambiguity in one of the following ways:
Value | Description |
---|---|
"infer" | Infer the DST transition from the sequence of times provided. |
Boolean array | An array (e.g. list, NumPy array) of booleans where True indicates DST and False indicates non-DST. |
"NaT" | Ambiguous times are converted into NaT. |
"raise" | Ambiguous times will raise an error. |
By default, ambiguous="raise".
7. dayfirst | boolean | optional
If True, then treat the first number as a day. For instance, "10/12/2020" will be parsed as December 10th, 2020. By default, dayfirst=False.
8. yearfirst | boolean | optional
If True, then treat the first number as a year. For instance, "20/12/10" will be parsed as December 10th, 2020. By default, yearfirst=False.
9. dtype | string or numpy.dtype or DatetimeTZDtype | optional
The allowed dtype is 'datetime64[ns]'. Instead of specifying tz, you can embed the timezone using dtype like so:
dtype="datetime64[ns, Europe/Paris]"
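A short sketch of embedding the timezone in dtype rather than passing tz. The date string is an arbitrary example value.

```python
import pandas as pd

# The timezone is carried by the dtype string, so tz is not passed.
idx = pd.DatetimeIndex(["2020-12-25"], dtype="datetime64[ns, Europe/Paris]")
print(idx)
print(idx.dtype)
```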
10. copy | boolean | optional
If True, then a new copy of data will be returned. Modifying this copied data will not affect the original data, and vice versa.
If False, then a reference to data will be returned. Modifying the return value will mutate the original data, and vice versa.
By default, copy=False.
11. name | string | optional
The name assigned to the resulting DatetimeIndex. By default, name=None.
Return Value
A DatetimeIndex object.
Examples
Basic usage
To create a DatetimeIndex using a list of date strings:
idx
DatetimeIndex(['2020-12-25', '2020-12-26'], dtype='datetime64[ns]', freq=None)
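A minimal sketch of a call that should produce an index like the one shown, assuming pandas is imported under the usual pd alias:

```python
import pandas as pd

# Construct the index directly from a list of ISO-formatted date strings.
idx = pd.DatetimeIndex(["2020-12-25", "2020-12-26"])
print(idx)
```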
We can use this to initialise a DataFrame with a DatetimeIndex:
df
            A
2020-12-25  a
2020-12-26  b
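One way such a DataFrame could be built is sketched below. The column name A and the values a and b are taken from the output; the exact construction call is an assumption.

```python
import pandas as pd

idx = pd.DatetimeIndex(["2020-12-25", "2020-12-26"])

# Pass the DatetimeIndex as the row index of the DataFrame.
df = pd.DataFrame({"A": ["a", "b"]}, index=idx)
print(df)
```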
Specifying a timezone
To specify a timezone, set tz like so:
DatetimeIndex(['2020-12-25 00:00:00+09:00', '2020-12-26 00:00:00+09:00'], dtype='datetime64[ns, Asia/Tokyo]', freq=None)
Here, the appended +09:00 means that the standard time in Tokyo is 9 hours ahead of UTC.
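A minimal sketch of a call that should yield a Tokyo-localized index like the one above. The date strings are inferred from the output shown.

```python
import pandas as pd

# The naive date strings are localized to the Asia/Tokyo timezone.
idx = pd.DatetimeIndex(["2020-12-25", "2020-12-26"], tz="Asia/Tokyo")
print(idx)
```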
Dealing with ambiguous times
At 3 AM on 2019-10-27 in the Central European time zone, DST ended and the wall clock was turned back one hour to 2 AM. Local times from 02:00:00 up to (but not including) 03:00:00, such as 2019-10-27 02:30:00, therefore occurred twice and are ambiguous.
raise
By default, ambiguous="raise", which means that an error is raised if there are ambiguous dates:
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:00:00', '2019-10-27 02:30:00', '2019-10-27 03:00:00', '2019-10-27 03:30:00'], tz="CET")
AmbiguousTimeError: Cannot infer dst time from 2019-10-27 02:30:00, try using the 'ambiguous' argument
infer
Setting ambiguous="infer" tells Pandas to try to resolve the ambiguity from the sequence of dates supplied:
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:00:00', '2019-10-27 02:30:00', '2019-10-27 03:00:00', '2019-10-27 03:30:00'], tz="CET", ambiguous="infer")
DatetimeIndex(['2019-10-27 02:30:00+02:00', '2019-10-27 02:00:00+01:00', '2019-10-27 02:30:00+01:00', '2019-10-27 03:00:00+01:00', '2019-10-27 03:30:00+01:00'], dtype='datetime64[ns, CET]', freq=None)
Here, notice how Pandas has inferred that the first 02:30:00 is in DST, while the second 02:30:00 is non-DST.
Array of booleans
To explicitly tell Pandas whether a date should be parsed as DST or non-DST, pass an array of booleans where True indicates DST:
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:50:00'], tz="CET", ambiguous=[True, False])
DatetimeIndex(['2019-10-27 02:30:00+02:00', '2019-10-27 02:50:00+01:00'], dtype='datetime64[ns, CET]', freq=None)
NaT
To set ambiguous dates as NaT (not-a-time):
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:00:00', '2019-10-27 02:30:00', '2019-10-27 03:00:00', '2019-10-27 03:30:00'], tz="CET", ambiguous="NaT")
DatetimeIndex(['NaT', 'NaT', 'NaT', '2019-10-27 03:00:00+01:00', '2019-10-27 03:30:00+01:00'], dtype='datetime64[ns, CET]', freq=None)