Working with time data

Pandas Joins for Spreadsheet Users

John Miller

Principal Data Scientist

Joining with .merge_ordered()

time-series merge no result

pd.merge_ordered(cleveland, dallas, on='Game_Date', 
                suffixes=['_CLE', '_DAL'])
Pandas Joins for Spreadsheet Users

Joining with .merge_ordered()

time-series merge

pd.merge_ordered(cleveland, dallas, on='Game_Date', 
                suffixes=['_CLE', '_DAL'])
Pandas Joins for Spreadsheet Users

Interpolating data

interpolate on merge

pd.merge_ordered(tc2, td2, on='Game_Date', 
                suffixes=['_CLE', '_DAL'], fill_method='ffill')
Pandas Joins for Spreadsheet Users

Interpolating data

fgf

pd.merge_ordered(tc2, td2, on='Game_Date', 
                suffixes=['_CLE', '_DAL'], fill_method='ffill')
Pandas Joins for Spreadsheet Users

Merging to nearest date-times

  • pandas.merge_asof()
  • matches on nearest date
  • similar to VLOOKUP(range_lookup=TRUE)
pd.merge_asof(left_df, right_df,
              direction='backward')

Directions

  • "backward": closest date that is earlier

  • "forward": closest date equal or later

  • "nearest": closest date regardless

Pandas Joins for Spreadsheet Users

Merge_asof Example

merge_asof

pd.merge_asof(temps, impacts,
              left_on='Game_Date', right_index=True, direction='backward')
Pandas Joins for Spreadsheet Users

Let's practice!

Pandas Joins for Spreadsheet Users

Preparing Video For Download...