Pandas | DatetimeIndex constructor
Pandas DatetimeIndex constructor creates a new DatetimeIndex object, which is most often used as the index of a DataFrame.
Prefer the pd.date_range(~) method for initialising a DatetimeIndex over calling this constructor directly. date_range(~) offers more flexibility and is more widely used in the official documentation.
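As a minimal sketch of the date_range(~) alternative mentioned above, the following builds the same three-day index in two ways. The dates are illustrative only:

```python
import pandas as pd

# Three consecutive days, given a start and a number of periods
by_periods = pd.date_range(start="2020-12-25", periods=3, freq="D")

# The same index, given a start and an end (daily frequency is the default)
by_bounds = pd.date_range(start="2020-12-25", end="2020-12-27")

print(by_periods)
print(by_periods.equals(by_bounds))
```

Both calls return a DatetimeIndex, so the result can be used wherever the constructor's output would be.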
Parameters
1. data | 1D array-like | optional
An array-like to construct the DatetimeIndex from.
2. freq | string | optional
The frequency of the DatetimeIndex. By default, freq=None.
3. tz | pytz.timezone or dateutil.tz.tzfile or datetime.tzinfo or string | optional
The timezone of DatetimeIndex. By default, the resulting DatetimeIndex is naive in the sense that it has no notion of timezones.
4. normalize | boolean | optional
Whether to set the time component of each date to midnight. By default, normalize=False.
Setting this parameter does not seem to have any effect. If you encounter the same issue, use date_range(~)'s normalize instead.
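A minimal sketch of the workaround, assuming illustrative timestamps. It also shows the normalize() method of an existing DatetimeIndex, which is a separate way to reach the same result:

```python
import pandas as pd

# normalize=True resets the time component to midnight before generating dates
idx = pd.date_range("2020-12-25 15:30", periods=2, freq="D", normalize=True)
print(idx)

# An already-built index can be normalised after the fact
raw = pd.DatetimeIndex(["2020-12-25 15:30", "2020-12-26 08:00"])
midnight = raw.normalize()
print(midnight)
```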
5. closed | None or string | optional
Whether or not to make the bounds inclusive/exclusive:
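| Value | Description |
|---|---|
| "left" | The left bound is inclusive and the right bound is exclusive. |
| "right" | The left bound is exclusive and the right bound is inclusive. |
| None | Both bounds become inclusive. |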
By default, closed=None.
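Bound handling is easiest to see with date_range(~). The sketch below assumes pandas 1.4 or later, where date_range(~) exposes this choice through a parameter named inclusive. The dates are illustrative:

```python
import pandas as pd

start, end = "2020-12-25", "2020-12-28"

# Assumption: pandas >= 1.4, where date_range accepts inclusive=
both = pd.date_range(start, end, inclusive="both")    # 25th to 28th
left = pd.date_range(start, end, inclusive="left")    # drops the end bound
right = pd.date_range(start, end, inclusive="right")  # drops the start bound

print(len(both), len(left), len(right))
```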
6. ambiguous | string or array<boolean> | optional
This parameter is only relevant if you specified the timezone (tz). Due to time adjustments caused by Daylight Saving Time (DST), ambiguity in the time can arise. For instance, consider the following case:
Local time:
01:59:58
01:59:59
01:00:00 # DST ends and so we set the wall clock back 1 hour
01:00:01
...
01:59:58 # This local time occurred for the second time
...
If you try to localize time that occurred twice (e.g. 01:59:58), then Pandas will get confused as to which time you're referring to - the first one (DST) or the second one (non-DST)?
Pandas can deal with such ambiguity in one of the following ways:
| Value | Description |
|---|---|
| "infer" | Infer the DST transition from the sequence of time provided. |
| array of booleans | An array (e.g. lists, Numpy array) of booleans where True indicates DST and False indicates non-DST. |
| "NaT" | Ambiguous times are converted into NaT. |
| "raise" | Ambiguous times will raise an error. |
By default, ambiguous="raise".
7. dayfirst | boolean | optional
If True, then treat the first number as a day. For instance, "10/12/2020" will be parsed as December 10th, 2020. By default, dayfirst=False.
8. yearfirst | boolean | optional
If True, then treat the first number as a year. For instance, "20/12/10" will be parsed as December 10th, 2020. By default, yearfirst=False.
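The effect of dayfirst and yearfirst can be checked with a short sketch using the same date strings as in the descriptions above:

```python
import pandas as pd

# Default parsing treats the first number as the month
month_first = pd.DatetimeIndex(["10/12/2020"])

# dayfirst=True treats the first number as the day
day_first = pd.DatetimeIndex(["10/12/2020"], dayfirst=True)

# yearfirst=True treats the first number as the (two-digit) year
year_first = pd.DatetimeIndex(["20/12/10"], yearfirst=True)

print(month_first[0], day_first[0], year_first[0])
```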
9. dtype | string or numpy.dtype or DatetimeTZDtype | optional
The dtype allowed is 'datetime64[ns]'. Instead of specifying tz, you can embed the timezone using dtype like so:
dtype="datetime64[ns, Europe/Paris]"
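A minimal sketch of embedding the timezone in dtype, with an illustrative date:

```python
import pandas as pd

# The timezone is taken from the dtype string instead of the tz parameter
idx = pd.DatetimeIndex(["2020-12-25"], dtype="datetime64[ns, Europe/Paris]")

print(idx.dtype)
print(idx.tz)
```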
10. copy | boolean | optional
If True, then a new copy of data will be returned - modifying this copied data will not affect the original data, and vice versa.
If False, then a reference to data will be returned - modifying the return value will mutate the original data, and vice versa.
By default, copy=False.
11. name | string | optional
The name assigned to the resulting DatetimeIndex. By default, name=None.
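To illustrate the freq and name parameters from the list above, here is a small sketch with made-up dates. Passing freq="infer" asks Pandas to work out the frequency from the data:

```python
import pandas as pd

dates = ["2020-12-25", "2020-12-26", "2020-12-27"]

# Explicit daily frequency plus a name for the index
idx = pd.DatetimeIndex(dates, freq="D", name="date")
print(idx.freqstr, idx.name)

# Let Pandas infer the frequency from evenly spaced dates
inferred = pd.DatetimeIndex(dates, freq="infer")
print(inferred.freqstr)
```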
Return Value
A DatetimeIndex object.
Examples
Basic usage
To create a DatetimeIndex using a list of date strings:
import pandas as pd
idx = pd.DatetimeIndex(["2020-12-25", "2020-12-26"])
idx
DatetimeIndex(['2020-12-25', '2020-12-26'], dtype='datetime64[ns]', freq=None)
We can use this to initialise a DataFrame with DatetimeIndex:
df = pd.DataFrame({"A": ["a", "b"]}, index=idx)
df
            A
2020-12-25  a
2020-12-26  b
Specifying a timezone
To specify a timezone, set tz like so:
pd.DatetimeIndex(["2020-12-25", "2020-12-26"], tz="Asia/Tokyo")
DatetimeIndex(['2020-12-25 00:00:00+09:00', '2020-12-26 00:00:00+09:00'], dtype='datetime64[ns, Asia/Tokyo]', freq=None)
Here, the appended +09:00 means that the standard time in Tokyo is 9 hours ahead of UTC.
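The difference between a naive and a timezone-aware index can be inspected directly. This is a minimal sketch using the same two dates:

```python
import pandas as pd

naive = pd.DatetimeIndex(["2020-12-25", "2020-12-26"])
aware = pd.DatetimeIndex(["2020-12-25", "2020-12-26"], tz="Asia/Tokyo")

print(naive.tz)              # no timezone attached
print(aware.tz)              # the Tokyo timezone
print(aware[0].utcoffset())  # offset from UTC of the first timestamp
```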
Dealing with ambiguous times
At 3AM on 2019-10-27 (Central European Summer Time), DST ended, which means that the wall clock was turned back one hour to 2AM. Therefore, we have an ambiguous case here where local times like 2019-10-27 02:30:00 occurred twice.
raise
By default, ambiguous="raise", which means that an error is raised if there are ambiguous dates:
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:00:00', '2019-10-27 02:30:00', '2019-10-27 03:00:00', '2019-10-27 03:30:00'], tz="CET")
AmbiguousTimeError: Cannot infer dst time from 2019-10-27 02:30:00, try using the 'ambiguous' argument
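If the error needs to be handled rather than propagated, a sketch along these lines may help. The exact exception class is an assumption that depends on the installed Pandas version, so the example catches a broad Exception:

```python
import pandas as pd

times = ["2019-10-27 02:30:00", "2019-10-27 03:30:00"]

try:
    # ambiguous="raise" is the default, so 02:30:00 triggers an error
    pd.DatetimeIndex(times, tz="CET")
    raised = False
except Exception as exc:  # the concrete class varies between Pandas versions
    raised = True
    print(type(exc).__name__, exc)
```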
infer
Setting ambiguous="infer" tells Pandas to try to resolve the ambiguity from the sequence of dates supplied:
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:00:00', '2019-10-27 02:30:00', '2019-10-27 03:00:00', '2019-10-27 03:30:00'], tz="CET", ambiguous="infer")
DatetimeIndex(['2019-10-27 02:30:00+02:00', '2019-10-27 02:00:00+01:00', '2019-10-27 02:30:00+01:00', '2019-10-27 03:00:00+01:00', '2019-10-27 03:30:00+01:00'], dtype='datetime64[ns, CET]', freq=None)
Here, notice how Pandas has inferred that the first 02:30:00 is in DST, while the second 02:30:00 is non-DST.
Array of booleans
To explicitly tell Pandas whether a date should be parsed as DST or non-DST, pass an array of booleans where True indicates DST:
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:50:00'], tz="CET", ambiguous=[True, False])
DatetimeIndex(['2019-10-27 02:30:00+02:00', '2019-10-27 02:50:00+01:00'], dtype='datetime64[ns, CET]', freq=None)
NaT
To set ambiguous dates as NaT (not-a-time):
pd.DatetimeIndex(['2019-10-27 02:30:00',
 '2019-10-27 02:00:00', '2019-10-27 02:30:00', '2019-10-27 03:00:00', '2019-10-27 03:30:00'], tz="CET", ambiguous="NaT")
DatetimeIndex(['NaT', 'NaT', 'NaT', '2019-10-27 03:00:00+01:00', '2019-10-27 03:30:00+01:00'], dtype='datetime64[ns, CET]', freq=None)
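The same ambiguous options are also accepted by the tz_localize(~) method of an existing naive DatetimeIndex. The following is a sketch with a shortened version of the dates above; the variable names are illustrative:

```python
import numpy as np
import pandas as pd

naive = pd.DatetimeIndex([
    "2019-10-27 02:30:00",  # first occurrence (DST)
    "2019-10-27 02:00:00",  # clock has been turned back (non-DST)
    "2019-10-27 02:30:00",  # second occurrence (non-DST)
    "2019-10-27 03:00:00",  # not ambiguous
])

# Resolve the ambiguity from the order of the timestamps
inferred = naive.tz_localize("CET", ambiguous="infer")

# Spell out DST (True) or non-DST (False) for each element
flags = np.array([True, False, False, False])
explicit = naive.tz_localize("CET", ambiguous=flags)

# Replace every ambiguous timestamp with NaT
as_nat = naive.tz_localize("CET", ambiguous="NaT")

print(inferred)
print(as_nat)
```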