Concatenate DataFrames together vertically

Joining Data with pandas

Aaren Stubberfield

Instructor

Concatenate two tables vertically

Two tables being placed together vertically

  • pandas .concat() method can concatenate both vertical and horizontal.
    • axis=0, vertical
Joining Data with pandas

Basic concatenation

  • 3 different tables
  • Same column names
  • Table variable names:
    • inv_jan (top)
    • inv_feb (middle)
    • inv_mar (bottom)
  iid  cid  invoice_date  total
0 1    2    2009-01-01    1.98 
1 2    4    2009-01-02    3.96 
2 3    8    2009-01-03    5.94
  iid  cid  invoice_date  total
0 7    38   2009-02-01    1.98 
1 8    40   2009-02-01    1.98 
2 9    42   2009-02-02    3.96 
  iid  cid  invoice_date  total
0 14   17   2009-03-04    1.98 
1 15   19   2009-03-04    1.98 
2 16   21   2009-03-05    3.96 
Joining Data with pandas

Basic concatenation

pd.concat([inv_jan, inv_feb, inv_mar])
   iid  cid  invoice_date  total
0  1    2    2009-01-01    1.98 
1  2    4    2009-01-02    3.96 
2  3    8    2009-01-03    5.94 
0  7    38   2009-02-01    1.98 
1  8    40   2009-02-01    1.98 
2  9    42   2009-02-02    3.96 
0  14   17   2009-03-04    1.98 
1  15   19   2009-03-04    1.98 
2  16   21   2009-03-05    3.96
Joining Data with pandas

Ignoring the index

pd.concat([inv_jan, inv_feb, inv_mar],
          ignore_index=True)
   iid  cid  invoice_date  total
0  1    2    2009-01-01    1.98 
1  2    4    2009-01-02    3.96 
2  3    8    2009-01-03    5.94 
3  7    38   2009-02-01    1.98 
4  8    40   2009-02-01    1.98 
5  9    42   2009-02-02    3.96 
6  14   17   2009-03-04    1.98 
7  15   19   2009-03-04    1.98 
8  16   21   2009-03-05    3.96
Joining Data with pandas

Setting labels to original tables

pd.concat([inv_jan, inv_feb, inv_mar],
          ignore_index=False, 
          keys=['jan','feb','mar'])
       iid  cid  invoice_date  total
jan 0  1    2    2009-01-01    1.98 
    1  2    4    2009-01-02    3.96 
    2  3    8    2009-01-03    5.94 
feb 0  7    38   2009-02-01    1.98 
    1  8    40   2009-02-01    1.98 
    2  9    42   2009-02-02    3.96 
mar 0  14   17   2009-03-04    1.98 
    1  15   19   2009-03-04    1.98 
    2  16   21   2009-03-05    3.96
Joining Data with pandas

Concatenate tables with different column names

Table: inv_jan

  iid  cid  invoice_date  total
0 1    2    2009-01-01    1.98 
1 2    4    2009-01-02    3.96 
2 3    8    2009-01-03    5.94

Table: inv_feb

  iid  cid  invoice_date  total  bill_ctry
0 7    38   2009-02-01    1.98   Germany  
1 8    40   2009-02-01    1.98   France   
2 9    42   2009-02-02    3.96   France   
Joining Data with pandas

Concatenate tables with different column names

pd.concat([inv_jan, inv_feb],
          sort=True)
  bill_ctry  cid  iid  invoice_date  total
0 NaN        2    1    2009-01-01    1.98 
1 NaN        4    2    2009-01-02    3.96 
2 NaN        8    3    2009-01-03    5.94 
0 Germany    38   7    2009-02-01    1.98 
1 France     40   8    2009-02-01    1.98 
2 France     42   9    2009-02-02    3.96
Joining Data with pandas

Concatenate tables with different column names

pd.concat([inv_jan, inv_feb],
          join='inner')
iid  cid  invoice_date  total
1    2    2009-01-01    1.98 
2    4    2009-01-02    3.96 
3    8    2009-01-03    5.94 
7    38   2009-02-01    1.98 
8    40   2009-02-01    1.98 
9    42   2009-02-02    3.96
Joining Data with pandas

Let's practice!

Joining Data with pandas

Preparing Video For Download...