Joining data in real life

Pandas Joins for Spreadsheet Users

John Miller

Principal Data Scientist

Mixing columns and indexes

index-column-mix

teams.merge(positions, left_on='player_id', right_index=True)

or

positions.merge(teams, left_index=True, right_on='player_id')
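As a minimal sketch of the two calls above, assume `teams` stores the player id in an ordinary column while `positions` uses it as the row index. The sample rows and values here are invented for illustration and are not from the course data.

```python
import pandas as pd

# Hypothetical data: player_id is a regular column in teams
teams = pd.DataFrame({
    'player_id': [101, 102, 103],
    'team': ['Hawks', 'Bulls', 'Heat'],
})

# Hypothetical data: the player id is the index of positions
positions = pd.DataFrame(
    {'position': ['Guard', 'Center', 'Forward']},
    index=[101, 102, 104],
)

# Match the player_id column on the left against the index on the right
left_first = teams.merge(positions, left_on='player_id', right_index=True)

# Same join with the tables swapped: index on the left, column on the right
right_first = positions.merge(teams, left_index=True, right_on='player_id')

print(left_first)
print(right_first)
```

Both calls use the default inner join, so only players 101 and 102, which appear in both tables, are kept.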

Working with overlapping column names

overlapping columns

current.merge(drafted, on='name', suffixes=('_current', '_drafted'))
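A small sketch of how `suffixes` renames clashing columns, assuming both tables carry a `team` column in addition to the `name` key. The rows are made up for illustration.

```python
import pandas as pd

# Hypothetical data: both tables have a 'team' column
current = pd.DataFrame({'name': ['Ann', 'Bo'], 'team': ['Hawks', 'Bulls']})
drafted = pd.DataFrame({'name': ['Ann', 'Bo'], 'team': ['Heat', 'Bulls']})

# The shared non-key column 'team' gets one suffix per source table
merged = current.merge(drafted, on='name', suffixes=('_current', '_drafted'))

print(list(merged.columns))
# ['name', 'team_current', 'team_drafted']
```

Without the `suffixes` argument, pandas falls back to its defaults, `_x` and `_y`, which say less about where each column came from.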

Identifying rows by table

table indicator

current.merge(drafted, how='outer', on='name', suffixes=('_current', '_drafted'),
               indicator=True)
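A short sketch of what `indicator=True` adds, assuming two small tables that only partly overlap on `name`. The rows are invented for illustration.

```python
import pandas as pd

# Hypothetical data: 'Bo' appears in both tables, the others in only one
current = pd.DataFrame({'name': ['Ann', 'Bo'], 'team': ['Hawks', 'Bulls']})
drafted = pd.DataFrame({'name': ['Bo', 'Cy'], 'team': ['Bulls', 'Heat']})

merged = current.merge(drafted, how='outer', on='name',
                       suffixes=('_current', '_drafted'),
                       indicator=True)

# The extra '_merge' column records which table each row came from:
# 'left_only', 'right_only', or 'both'
source = merged.set_index('name')['_merge'].astype(str).to_dict()
print(source)
```

Filtering on `_merge` afterwards, for example `merged[merged['_merge'] == 'left_only']`, is one way to list the rows found in only one of the tables.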

Sorting rows by key

sorted rows

current.merge(drafted, on='name', suffixes=('_current', '_drafted'),
               sort=True)
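A minimal sketch of `sort=True`, assuming both tables list the same players in different, unsorted orders. The rows are made up for illustration.

```python
import pandas as pd

# Hypothetical data: names are deliberately out of order in both tables
current = pd.DataFrame({'name': ['Cy', 'Ann', 'Bo'],
                        'team': ['Heat', 'Hawks', 'Bulls']})
drafted = pd.DataFrame({'name': ['Bo', 'Cy', 'Ann'],
                        'team': ['Bulls', 'Nets', 'Heat']})

# sort=True orders the result by the join key, here 'name'
merged = current.merge(drafted, on='name',
                       suffixes=('_current', '_drafted'),
                       sort=True)

print(list(merged['name']))
# ['Ann', 'Bo', 'Cy']
```

The sort applies to the join key only; ordering by any other column would still need a separate `sort_values` call.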

Let's practice!
