
Pandas | DatetimeIndex constructor

Last updated: Aug 10, 2023
Tags: Python, Pandas

Pandas DatetimeIndex constructor creates a new DatetimeIndex object, which is most often used as the index of a DataFrame.

WARNING

Prefer the pd.date_range(~) method over calling this constructor directly when initialising a DatetimeIndex. date_range(~) offers more flexibility and is more widely used in the official documentation.
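As a minimal sketch of the recommendation above, the following builds the same three dates with date_range(~) and with the constructor. The dates and variable names are illustrative only:

```python
import pandas as pd

# Three consecutive days starting on 2020-12-25, generated from a start date.
idx = pd.date_range(start="2020-12-25", periods=3, freq="D")
print(idx)

# The same dates passed explicitly to the constructor.
same = pd.DatetimeIndex(["2020-12-25", "2020-12-26", "2020-12-27"])

# Both hold the same timestamps (date_range additionally records freq).
print(list(idx) == list(same))
```

With date_range(~) only the start, the number of periods and the frequency are needed, whereas the constructor expects every date to be listed.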

Parameters

1. data | 1D array-like | optional

An array-like to construct the DatetimeIndex from.

2. freq | string | optional

The frequency of the DatetimeIndex. By default, freq=None.
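To illustrate what freq does, here is a small sketch with made-up dates. It assumes that pandas validates the supplied frequency against the data and raises a ValueError when the dates do not conform to it:

```python
import pandas as pd

dates = ["2020-12-25", "2020-12-26", "2020-12-27"]

# Without freq, no frequency is attached to the index.
no_freq = pd.DatetimeIndex(dates)
print(no_freq.freq)

# With freq="D", the index is tagged as daily.
daily = pd.DatetimeIndex(dates, freq="D")
print(daily.freqstr)

# Dates that are not evenly spaced by one day do not conform to freq="D".
rejected = False
try:
    pd.DatetimeIndex(["2020-12-25", "2020-12-27", "2020-12-28"], freq="D")
except ValueError as err:
    rejected = True
    print("rejected:", err)
```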

3. tz | pytz.timezone or dateutil.tz.tzfile or datetime.tzinfo or string | optional

The timezone of DatetimeIndex. By default, the resulting DatetimeIndex is naive in the sense that it has no notion of timezones.

4. normalize | boolean | optional

Whether to set the time component of each date to midnight. By default, normalize=False.

WARNING

Setting this parameter seems to have no effect. If you encounter the same issue, use date_range(~)'s normalize instead.

5. closed | None or string | optional

Whether to make the bounds inclusive or exclusive:

Value | Description
"left" | The left endpoint becomes inclusive and the right endpoint becomes exclusive.
"right" | The left endpoint becomes exclusive and the right endpoint becomes inclusive.
None | Both bounds become inclusive.

By default, closed=None.

6. ambiguous | string or array<boolean> | optional

This parameter is only relevant if you specified the timezone (tz). Due to time adjustments caused by Daylight Saving Time (DST), ambiguity in the time can arise. For instance, consider the following case:

Local time:
01:59:58
01:59:59   
01:00:00   # DST ends and so we set the wall clock back 1 hour
01:00:01
...
01:59:58   # This local time occurred for the second time
...

If you try to localize a time that occurred twice (e.g. 01:59:58), then Pandas cannot tell which one you are referring to: the first one (DST) or the second one (non-DST).
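The ambiguity can be seen without Pandas. The sketch below uses the standard-library zoneinfo module (Python 3.9+, and it assumes timezone data is available on the system). The zone Europe/Paris and the date 2019-10-27 are chosen for illustration because DST ended there on that day; the fold attribute selects the first or second occurrence of the repeated wall-clock time:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

paris = ZoneInfo("Europe/Paris")

# The same wall-clock time, once before and once after the clock was set back.
first = datetime(2019, 10, 27, 2, 30, tzinfo=paris, fold=0)   # still in DST
second = datetime(2019, 10, 27, 2, 30, tzinfo=paris, fold=1)  # DST has ended

print(first.utcoffset(), second.utcoffset())  # 2:00:00 1:00:00

# Converted to UTC, the two readings are one hour apart.
gap = second.astimezone(timezone.utc) - first.astimezone(timezone.utc)
print(gap)  # 1:00:00
```

A string such as "2019-10-27 02:30:00" carries no fold information, which is why the ambiguous parameter exists.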

Pandas can deal with such ambiguity in one of the following ways:

Value | Description
"infer" | Infer the DST transition from the sequence of times provided.
array of boolean | An array (e.g. list, NumPy array) of booleans where True indicates DST time and False indicates non-DST time.
"NaT" | Ambiguous times are converted into NaT (not-a-time).
"raise" | Ambiguous times will raise an error.

By default, ambiguous="raise".

7. dayfirst | boolean | optional

If True, then treat the first number as a day. For instance, "10/12/2020" will be parsed as December 10th, 2020. By default, dayfirst=False.

8. yearfirst | boolean | optional

If True, then treat the first number as a year. For instance, "20/12/10" will be parsed as December 10th, 2020. By default, yearfirst=False.
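The effect of dayfirst and yearfirst can be sketched as follows, reusing the two date strings from the descriptions above. The variable names are illustrative:

```python
import pandas as pd

# Default parsing treats the first number as the month.
month_first = pd.DatetimeIndex(["10/12/2020"])
print(month_first[0].date())  # 2020-10-12

# dayfirst=True treats the first number as the day.
day_first = pd.DatetimeIndex(["10/12/2020"], dayfirst=True)
print(day_first[0].date())    # 2020-12-10

# yearfirst=True treats the first number as the (two-digit) year.
year_first = pd.DatetimeIndex(["20/12/10"], yearfirst=True)
print(year_first[0].date())   # 2020-12-10
```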

9. dtype | string or numpy.dtype or DatetimeTZDtype | optional

The only dtype allowed is 'datetime64[ns]', optionally with a timezone. Instead of specifying tz, you can embed the timezone in dtype like so:

dtype="datetime64[ns, Europe/Paris]"
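As a rough check that the two spellings are interchangeable, the sketch below builds the same dates once with tz and once with a timezone-aware dtype. The dates are made up for illustration:

```python
import pandas as pd

dates = ["2020-12-25", "2020-12-26"]

# Timezone given through the tz parameter.
via_tz = pd.DatetimeIndex(dates, tz="Europe/Paris")

# Timezone embedded in the dtype string instead.
via_dtype = pd.DatetimeIndex(dates, dtype="datetime64[ns, Europe/Paris]")

print(via_dtype.dtype)
print(list(via_tz) == list(via_dtype))
```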

10. copy | boolean | optional

  • If True, then a new copy of data will be returned - modifying this copied data will not affect the original data, and vice versa.

  • If False, then a reference to data will be returned - modifying the return value will mutate the original data, and vice versa.

By default, copy=False.

11. name | string | optional

The name assigned to the resulting DatetimeIndex. By default, name=None.

Return Value

A DatetimeIndex object.

Examples

Basic usage

To create a DatetimeIndex using a list of date strings:

import pandas as pd
idx = pd.DatetimeIndex(["2020-12-25","2020-12-26"])
idx
DatetimeIndex(['2020-12-25', '2020-12-26'], dtype='datetime64[ns]', freq=None)

We can use this to initialise a DataFrame with DatetimeIndex:

df = pd.DataFrame({"A":["a","b"]}, index=idx)
df
            A
2020-12-25  a
2020-12-26  b
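One reason to index a DataFrame by dates is label-based selection. The following sketch rebuilds the DataFrame above and selects rows by date with loc; the lookups themselves are illustrative and not part of the original example:

```python
import pandas as pd

idx = pd.DatetimeIndex(["2020-12-25", "2020-12-26"])
df = pd.DataFrame({"A": ["a", "b"]}, index=idx)

# Look up a single cell by its date label.
value = df.loc[pd.Timestamp("2020-12-25"), "A"]
print(value)  # a

# Slice with a date string: every row from 2020-12-26 onwards.
tail = df.loc["2020-12-26":, "A"].tolist()
print(tail)  # ['b']
```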

Specifying a timezone

To specify a timezone, set tz like so:

pd.DatetimeIndex(["2020-12-25","2020-12-26"], tz="Asia/Tokyo")
DatetimeIndex(['2020-12-25 00:00:00+09:00', '2020-12-26 00:00:00+09:00'], dtype='datetime64[ns, Asia/Tokyo]', freq=None)

Here, the appended +09:00 means that the standard time in Tokyo is 9 hours ahead of UTC.
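To make the offset concrete, this short sketch converts the Tokyo index to UTC with tz_convert(~). Midnight in Tokyo corresponds to 15:00 on the previous day in UTC:

```python
import pandas as pd

tokyo = pd.DatetimeIndex(["2020-12-25", "2020-12-26"], tz="Asia/Tokyo")

# Same instants in time, expressed in UTC instead of Tokyo local time.
utc = tokyo.tz_convert("UTC")
print(utc[0])  # 2020-12-24 15:00:00+00:00
```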

Dealing with ambiguous times

At 3 AM on 2019-10-27, DST ended in the Central European Time zone, which means that the wall clock was turned back one hour to 2 AM. Therefore, local times between 02:00:00 and 02:59:59 on that day, such as 2019-10-27 02:30:00, occurred twice and are ambiguous.

raise

By default, ambiguous="raise", which means that an error is raised if there are ambiguous dates:

pd.DatetimeIndex(['2019-10-27 02:30:00',
'2019-10-27 02:00:00',
'2019-10-27 02:30:00',
'2019-10-27 03:00:00',
'2019-10-27 03:30:00'], tz="CET")
AmbiguousTimeError: Cannot infer dst time from 2019-10-27 02:30:00, try using the 'ambiguous' argument

infer

Setting ambiguous="infer" tells Pandas to try to resolve the ambiguity from the sequence of dates supplied:

pd.DatetimeIndex(['2019-10-27 02:30:00',
'2019-10-27 02:00:00',
'2019-10-27 02:30:00',
'2019-10-27 03:00:00',
'2019-10-27 03:30:00'], tz="CET", ambiguous="infer")
DatetimeIndex(['2019-10-27 02:30:00+02:00', '2019-10-27 02:00:00+01:00',
'2019-10-27 02:30:00+01:00', '2019-10-27 03:00:00+01:00',
'2019-10-27 03:30:00+01:00'],
dtype='datetime64[ns, CET]', freq=None)

Here, notice how Pandas has inferred that the first 02:30:00 is in DST (+02:00), while the second 02:30:00 is non-DST (+01:00).

Array of booleans

To explicitly tell Pandas whether a date should be parsed as DST or non-DST, pass an array of booleans where True indicates DST:

pd.DatetimeIndex(['2019-10-27 02:30:00',
'2019-10-27 02:50:00'], tz="CET", ambiguous=[True, False])
DatetimeIndex(['2019-10-27 02:30:00+02:00', '2019-10-27 02:50:00+01:00'], dtype='datetime64[ns, CET]', freq=None)

NaT

To set ambiguous dates as NaT (not-a-time):

pd.DatetimeIndex(['2019-10-27 02:30:00',
'2019-10-27 02:00:00',
'2019-10-27 02:30:00',
'2019-10-27 03:00:00',
'2019-10-27 03:30:00'], tz="CET", ambiguous="NaT")
DatetimeIndex(['NaT', 'NaT', 'NaT', '2019-10-27 03:00:00+01:00',
'2019-10-27 03:30:00+01:00'],
dtype='datetime64[ns, CET]', freq=None)
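Once ambiguous times have been turned into NaT, they can be located or dropped like any other missing value. A minimal sketch, reusing the dates from the example above and assuming the output shown there:

```python
import pandas as pd

idx = pd.DatetimeIndex(['2019-10-27 02:30:00',
                        '2019-10-27 02:00:00',
                        '2019-10-27 02:30:00',
                        '2019-10-27 03:00:00',
                        '2019-10-27 03:30:00'], tz="CET", ambiguous="NaT")

# Boolean mask marking the entries that were ambiguous.
mask = idx.isna()
print(mask.tolist())  # [True, True, True, False, False]

# Keep only the unambiguous timestamps.
clean = idx[~mask]
print(clean)
```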
Published by Isshin Inada