Joining Data with pandas
Aaren Stubberfield
Instructor
.concat()
method can concatenate both vertical and horizontal.axis=0
, verticalinv_jan
(top)inv_feb
(middle)inv_mar
(bottom) iid cid invoice_date total
0 1 2 2009-01-01 1.98
1 2 4 2009-01-02 3.96
2 3 8 2009-01-03 5.94
iid cid invoice_date total
0 7 38 2009-02-01 1.98
1 8 40 2009-02-01 1.98
2 9 42 2009-02-02 3.96
iid cid invoice_date total
0 14 17 2009-03-04 1.98
1 15 19 2009-03-04 1.98
2 16 21 2009-03-05 3.96
pd.concat([inv_jan, inv_feb, inv_mar])
iid cid invoice_date total
0 1 2 2009-01-01 1.98
1 2 4 2009-01-02 3.96
2 3 8 2009-01-03 5.94
0 7 38 2009-02-01 1.98
1 8 40 2009-02-01 1.98
2 9 42 2009-02-02 3.96
0 14 17 2009-03-04 1.98
1 15 19 2009-03-04 1.98
2 16 21 2009-03-05 3.96
pd.concat([inv_jan, inv_feb, inv_mar],
ignore_index=True)
iid cid invoice_date total
0 1 2 2009-01-01 1.98
1 2 4 2009-01-02 3.96
2 3 8 2009-01-03 5.94
3 7 38 2009-02-01 1.98
4 8 40 2009-02-01 1.98
5 9 42 2009-02-02 3.96
6 14 17 2009-03-04 1.98
7 15 19 2009-03-04 1.98
8 16 21 2009-03-05 3.96
pd.concat([inv_jan, inv_feb, inv_mar],
ignore_index=False,
keys=['jan','feb','mar'])
iid cid invoice_date total
jan 0 1 2 2009-01-01 1.98
1 2 4 2009-01-02 3.96
2 3 8 2009-01-03 5.94
feb 0 7 38 2009-02-01 1.98
1 8 40 2009-02-01 1.98
2 9 42 2009-02-02 3.96
mar 0 14 17 2009-03-04 1.98
1 15 19 2009-03-04 1.98
2 16 21 2009-03-05 3.96
Table: inv_jan
iid cid invoice_date total
0 1 2 2009-01-01 1.98
1 2 4 2009-01-02 3.96
2 3 8 2009-01-03 5.94
Table: inv_feb
iid cid invoice_date total bill_ctry
0 7 38 2009-02-01 1.98 Germany
1 8 40 2009-02-01 1.98 France
2 9 42 2009-02-02 3.96 France
pd.concat([inv_jan, inv_feb],
sort=True)
bill_ctry cid iid invoice_date total
0 NaN 2 1 2009-01-01 1.98
1 NaN 4 2 2009-01-02 3.96
2 NaN 8 3 2009-01-03 5.94
0 Germany 38 7 2009-02-01 1.98
1 France 40 8 2009-02-01 1.98
2 France 42 9 2009-02-02 3.96
pd.concat([inv_jan, inv_feb],
join='inner')
iid cid invoice_date total
1 2 2009-01-01 1.98
2 4 2009-01-02 3.96
3 8 2009-01-03 5.94
7 38 2009-02-01 1.98
8 40 2009-02-01 1.98
9 42 2009-02-02 3.96
Joining Data with pandas