Working with Dates and Times in Python
Max Shron
Data Scientist & Author
rides['Duration'].dt.total_seconds().min()
-3346.0
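A negative minimum means at least one ride appears to end before it starts. The sketch below shows one way this can happen with timezone-naive timestamps. It assumes a single ride that starts at 01:56:50 on 2017-11-05 and ends at 01:01:04 on the wall clock, after clocks fall back in New York. The end time is a hypothetical value, not taken from the slides.

```python
import pandas as pd

# Assumed wall-clock times for one ride. The end time is hypothetical,
# chosen so the naive difference matches the -3346.0 shown above.
start = pd.Timestamp('2017-11-05 01:56:50')
end = pd.Timestamp('2017-11-05 01:01:04')

# Naive subtraction ignores the clock change, so the result is negative
naive_seconds = (end - start).total_seconds()
print(naive_seconds)

# Stating which side of the clock change each time is on gives the
# real elapsed time (ambiguous=True means daylight saving time applies)
start_aware = start.tz_localize('America/New_York', ambiguous=True)
end_aware = end.tz_localize('America/New_York', ambiguous=False)
aware_seconds = (end_aware - start_aware).total_seconds()
print(aware_seconds)
```

With a whole column, the correct side of the clock change is rarely known for every row, so dropping the ambiguous rows is often the more practical choice.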
rides['Start date'].head(3)
0 2017-10-01 15:23:25
1 2017-10-01 15:42:57
2 2017-10-02 06:37:10
Name: Start date, dtype: datetime64[ns]
rides['Start date'].head(3)\
.dt.tz_localize('America/New_York')
0 2017-10-01 15:23:25-04:00
1 2017-10-01 15:42:57-04:00
2 2017-10-02 06:37:10-04:00
Name: Start date, dtype: datetime64[ns, America/New_York]
# Try to set a timezone...
rides['Start date'] = rides['Start date']\
.dt.tz_localize('America/New_York')
pytz.exceptions.AmbiguousTimeError: Cannot infer dst time from '2017-11-05 01:56:50',
try using the 'ambiguous' argument
# Handle ambiguous datetimes
rides['Start date'] = rides['Start date']\
.dt.tz_localize('America/New_York', ambiguous='NaT')
rides['End date'] = rides['End date']\
.dt.tz_localize('America/New_York', ambiguous='NaT')
# Re-calculate duration, ignoring bad row
rides['Duration'] = rides['End date'] - rides['Start date']
# Find the minimum again
rides['Duration'].dt.total_seconds().min()
116.0
# Look at problematic row
rides.iloc[129]
Duration                       NaT
Start date                     NaT
End date                       NaT
Start station        6th & H St NE
End station          3rd & M St NE
Bike number                 W20529
Member type                 Member
Name: 129, dtype: object
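To see the effect of ambiguous='NaT' in isolation, here is a minimal sketch on a hypothetical three-value Series. Only the middle timestamp comes from the error message above; the other two are made up for contrast.

```python
import pandas as pd

# Hypothetical sample: the middle value falls in the repeated 1 AM hour
# on 2017-11-05, when US daylight saving time ended.
starts = pd.Series(pd.to_datetime([
    '2017-11-04 12:00:00',
    '2017-11-05 01:56:50',
    '2017-11-05 03:00:00',
]))

# Ambiguous wall-clock times become NaT instead of raising an error
localized = starts.dt.tz_localize('America/New_York', ambiguous='NaT')
print(localized)

# Only the ambiguous row is missing; the others keep their values
missing = localized.isna().tolist()
print(missing)
```

Any arithmetic involving the NaT row, such as computing a duration, also yields NaT, and aggregations like min() skip it by default.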
# Year of first three rows
rides['Start date']\
.head(3)\
.dt.year
0 2017
1 2017
2 2017
Name: Start date, dtype: int64
# See weekdays for first three rides
rides['Start date']\
.head(3)\
.dt.day_name()
0 Sunday
1 Sunday
2 Monday
Name: Start date, dtype: object
# Shift the indexes forward one, padding with NaT
rides['End date'].shift(1).head(3)
0 NaT
1 2017-10-01 15:26:26-04:00
2 2017-10-01 17:49:59-04:00
Name: End date, dtype: datetime64[ns, America/New_York]
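One plausible use of the shifted column is to line up each ride's start with the previous ride's end and measure the gap between them. This is a sketch on a hypothetical three-row frame. The start times and the first two end times match the output shown earlier, while the third end time is made up. Timestamps are left timezone-naive to keep the example short.

```python
import pandas as pd

# Hypothetical three-ride frame; the last end time is an invented placeholder
rides = pd.DataFrame({
    'Start date': pd.to_datetime([
        '2017-10-01 15:23:25',
        '2017-10-01 15:42:57',
        '2017-10-02 06:37:10',
    ]),
    'End date': pd.to_datetime([
        '2017-10-01 15:26:26',
        '2017-10-01 17:49:59',
        '2017-10-02 06:42:53',
    ]),
})

# Row i now holds the end time of ride i-1; row 0 has no predecessor
previous_end = rides['End date'].shift(1)

# Seconds between the previous ride ending and this ride starting
gap_seconds = (rides['Start date'] - previous_end).dt.total_seconds()
print(gap_seconds)
```

The first gap is missing because the first ride has nothing before it, which is the NaT padding that shift(1) introduces.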