Joining Data with pandas
Aaren Stubberfield
Instructor
Mutating joins:
Filtering joins:
gid name
0 1 Rock
1 2 Jazz
2 3 Metal
3 4 Alternative ...
4 5 Rock And Roll
tid name aid mtid gid composer u_price
0 1 For Those Ab... 1 1 1 Angus Young,... 0.99
1 2 Balls to the... 2 2 1 nan 0.99
2 3 Fast As a Shark 3 2 1 F. Baltes, S... 0.99
3 4 Restless and... 3 2 1 F. Baltes, R... 0.99
4 5 Princess of ... 3 2 1 Deaffy & R.A... 0.99
genres_tracks = genres.merge(top_tracks, on='gid')
print(genres_tracks.head())
gid name_x tid name_y aid mtid composer u_price
0 1 Rock 2260 Don't Stop M... 185 1 Mercury, Fre... 0.99
1 1 Rock 2933 Mysterious Ways 232 1 U2 0.99
2 1 Rock 2618 Speed Of Light 212 1 Billy Duffy/... 0.99
3 1 Rock 2998 When Love Co... 237 1 Bono/Clayton... 0.99
4 1 Rock 685 Who'll Stop ... 54 1 J. C. Fogerty 0.99
genres['gid'].isin(genres_tracks['gid'])
genres['gid'].isin(genres_tracks['gid'])
0 True
1 True
2 True
3 True
4 False
Name: gid, dtype: bool
genres_tracks = genres.merge(top_tracks, on='gid')
top_genres = genres[genres['gid'].isin(genres_tracks['gid'])]
print(top_genres.head())
gid name
0 1 Rock
1 2 Jazz
2 3 Metal
3 4 Alternative & Punk
4 6 Blues
genres_tracks = genres.merge(top_tracks, on='gid', how='left', indicator=True)
print(genres_tracks.head())
gid name_x tid name_y aid mtid composer u_price _merge
0 1 Rock 2260.0 Don't Stop M... 185.0 1.0 Mercury, Fre... 0.99 both
1 1 Rock 2933.0 Mysterious Ways 232.0 1.0 U2 0.99 both
2 1 Rock 2618.0 Speed Of Light 212.0 1.0 Billy Duffy/... 0.99 both
3 1 Rock 2998.0 When Love Co... 237.0 1.0 Bono/Clayton... 0.99 both
4 5 Rock And Roll NaN NaN NaN NaN NaN NaN left_only
gid_list = genres_tracks.loc[genres_tracks['_merge'] == 'left_only', 'gid']
print(gid_list.head())
23 5
34 9
36 11
37 12
38 13
Name: gid, dtype: int64
genres_tracks = genres.merge(top_tracks, on='gid', how='left', indicator=True) gid_list = genres_tracks.loc[genres_tracks['_merge'] == 'left_only','gid'] non_top_genres = genres[genres['gid'].isin(gid_list)]
print(non_top_genres.head())
gid name
0 5 Rock And Roll
1 9 Pop
2 11 Bossa Nova
3 12 Easy Listening
4 13 Heavy Metal
Joining Data with pandas