A closer look at one-to-one joins

Pandas Joins for Spreadsheet Users

John Miller

Principal Data Scientist

Basics of pandas.merge()

$$

  • Used with complementary data
  • pd.merge(left, right)
  • Joins on one or more columns
  • Similar to VLOOKUP

merged wood

Pandas Joins for Spreadsheet Users

Left merges

left merge

$$

pd.merge(df_left, df_right, 
         on='GameKey', how='left')
  • List data frame names for "df_left" and "df_right" placeholders
  • Resulting frame matches rows in left frame
Pandas Joins for Spreadsheet Users

Right merges

right merge

$$

pd.merge(df_left, df_right, 
         on='GameKey', how='right')
  • Same concept as left merge
  • Resulting frame rows match right frame
Pandas Joins for Spreadsheet Users

A framework for joins

$$

After viewing and understanding the data:

  • Determine the relationship
  • Check for unique values in key column
  • Write merge statement and execute

squares

Pandas Joins for Spreadsheet Users

Joining on two keys

$$ multikey merge

$$

  • Determine the relationship
  • Check for unique values in key column
  • Write merge statement and execute

$$

 pd.merge(df_left, df_right,
           on=['GameKey', 'PlayId'])
Pandas Joins for Spreadsheet Users

Let's practice!

Pandas Joins for Spreadsheet Users

Preparing Video For Download...