Using merge_asof()

Joining Data with pandas

Aaren Stubberfield

Instructor

Using merge_asof()

Joining two tables using merge_asof

  • Similar to a merge_ordered() left join
    • Similar features as merge_ordered()
  • Match on the nearest key column and not exact matches.
    • Merged "on" columns must be sorted.
Joining Data with pandas

Using merge_asof()

Joining two tables using merge_asof

  • Similar to a merge_ordered() left join
    • Similar features as merge_ordered()
  • Match on the nearest key column and not exact matches.
    • Merged "on" columns must be sorted.
Joining Data with pandas

Datasets

Table Name: visa

  date_time            close   
0 2017-11-17 16:00:00  110.32  
1 2017-11-17 17:00:00  110.24  
2 2017-11-17 18:00:00  110.065 
3 2017-11-17 19:00:00  110.04  
4 2017-11-17 20:00:00  110.0   
5 2017-11-17 21:00:00  109.9966
6 2017-11-17 22:00:00  109.82  

Table Name: ibm

   date_time            close   
0  2017-11-17 15:35:12  149.3   
1  2017-11-17 15:40:34  149.13  
2  2017-11-17 15:45:50  148.98  
3  2017-11-17 15:50:20  148.99  
4  2017-11-17 15:55:10  149.11  
5  2017-11-17 16:00:03  149.25  
6  2017-11-17 16:05:06  149.5175
7  2017-11-17 16:10:12  149.57  
8  2017-11-17 16:15:30  149.59  
9  2017-11-17 16:20:32  149.82  
10 2017-11-17 16:25:47  149.96
Joining Data with pandas

merge_asof() example

pd.merge_asof(visa, ibm, on='date_time', 
              suffixes=('_visa','_ibm'))
  date_time            close_visa  close_ibm
0 2017-11-17 16:00:00  110.32      149.11   
1 2017-11-17 17:00:00  110.24      149.83   
2 2017-11-17 18:00:00  110.065     149.59   
3 2017-11-17 19:00:00  110.04      149.505  
4 2017-11-17 20:00:00  110.0       149.42   
5 2017-11-17 21:00:00  109.9966    149.26   
6 2017-11-17 22:00:00  109.82      148.97

Table Name: ibm

   date_time            close   
0  2017-11-17 15:35:12  149.3   
1  2017-11-17 15:40:34  149.13  
2  2017-11-17 15:45:50  148.98  
3  2017-11-17 15:50:20  148.99  
4  2017-11-17 15:55:10  149.11  
5  2017-11-17 16:00:03  149.25  
6  2017-11-17 16:05:06  149.5175
7  2017-11-17 16:10:12  149.57  
8  2017-11-17 16:15:30  149.59  
9  2017-11-17 16:20:32  149.82  
10 2017-11-17 16:25:47  149.96
Joining Data with pandas

merge_asof() example with direction

pd.merge_asof(visa, ibm, on=['date_time'], 
              suffixes=('_visa','_ibm'), 
              direction='forward')
  date_time            close_visa  close_ibm
0 2017-11-17 16:00:00  110.32      149.25   
1 2017-11-17 17:00:00  110.24      149.6184 
2 2017-11-17 18:00:00  110.065     149.59   
3 2017-11-17 19:00:00  110.04      149.505  
4 2017-11-17 20:00:00  110.0       149.42   
5 2017-11-17 21:00:00  109.9966    149.26   
6 2017-11-17 22:00:00  109.82      148.97   

Table Name: ibm

   date_time            close   
0  2017-11-17 15:35:12  149.3   
1  2017-11-17 15:40:34  149.13  
2  2017-11-17 15:45:50  148.98  
3  2017-11-17 15:50:20  148.99  
4  2017-11-17 15:55:10  149.11  
5  2017-11-17 16:00:03  149.25  
6  2017-11-17 16:05:06  149.5175
7  2017-11-17 16:10:12  149.57  
8  2017-11-17 16:15:30  149.59  
9  2017-11-17 16:20:32  149.82  
10 2017-11-17 16:25:47  149.96
Joining Data with pandas

When to use merge_asof()

  • Data sampled from a process
  • Developing a training set (no data leakage)
Joining Data with pandas

Let's practice!

Joining Data with pandas

Preparing Video For Download...