Joining Data with pandas
Aaren Stubberfield
Instructor
One-To-One = Every row in the left table is related to only one row in the right table
ward alderman address zip
0 1 Proco "Joe" ... 2058 NORTH W... 60647
1 2 Brian Hopkins 1400 NORTH ... 60622
2 3 Pat Dowell 5046 SOUTH S... 60609
3 4 William D. B... 435 EAST 35T... 60616
4 5 Leslie A. Ha... 2325 EAST 71... 60649
ward pop_2000 pop_2010 change address zip
0 1 52951 56149 6% 2765 WEST SA... 60647
1 2 54361 55805 3% WM WASTE MAN... 60622
2 3 40385 53039 31% 17 EAST 38TH... 60653
3 4 51953 54589 5% 31ST ST HARB... 60653
4 5 55302 51455 -7% JACKSON PARK... 60637
One-To-Many = Every row in left table is related to one or more rows in the right table
licenses = pd.read_csv('Business_Licenses.csv')
print(licenses.head())
print(licenses.shape)
account ward aid business address zip
0 307071 3 743 REGGIE'S BAR... 2105 S STATE ST 60616
1 10 10 829 HONEYBEERS 13200 S HOUS... 60633
2 10002 14 775 CELINA DELI 5089 S ARCHE... 60632
3 10005 12 nan KRAFT FOODS ... 2005 W 43RD ST 60609
4 10044 44 638 NEYBOUR'S TA... 3651 N SOUTH... 60613
(10000, 6)
ward alderman address zip
0 1 Proco "Joe" ... 2058 NORTH W... 60647
1 2 Brian Hopkins 1400 NORTH ... 60622
2 3 Pat Dowell 5046 SOUTH S... 60609
3 4 William D. B... 435 EAST 35T... 60616
4 5 Leslie A. Ha... 2325 EAST 71... 60649
account ward aid business address zip
0 307071 3 743 REGGIE'S BAR... 2105 S STATE ST 60616
1 10 10 829 HONEYBEERS 13200 S HOUS... 60633
2 10002 14 775 CELINA DELI 5089 S ARCHE... 60632
3 10005 12 nan KRAFT FOODS ... 2005 W 43RD ST 60609
4 10044 44 638 NEYBOUR'S TA... 3651 N SOUTH... 60613
ward_licenses = wards.merge(licenses, on='ward', suffixes=('_ward','_lic'))
print(ward_licenses.head())
ward alderman address_ward zip_ward account aid business address_lic zip_lic
0 1 Proco "Joe" ... 2058 NORTH W... 60647 12024 nan DIGILOG ELEC... 1038 N ASHLA... 60622
1 1 Proco "Joe" ... 2058 NORTH W... 60647 14446 743 EMPTY BOTTLE... 1035 N WESTE... 60622
2 1 Proco "Joe" ... 2058 NORTH W... 60647 14624 775 LITTLE MEL'S... 2205 N CALIF... 60647
3 1 Proco "Joe" ... 2058 NORTH W... 60647 14987 nan MR. BROWN'S ... 2301 W CHICA... 60622
4 1 Proco "Joe" ... 2058 NORTH W... 60647 15642 814 Beat Kitchen 2000-2100 W ... 60622
print(wards.shape)
(50, 4)
print(ward_licenses.shape)
(10000, 9)
Joining Data with pandas