Concatenating DataFrames

Data Transformation with Polars

Liam Brannigan

Data Scientist & Polars Contributor

Back to the restaurant app

shape: (4, 4)
| business         | location    | review | price |
| ---              | ---         | ---    | ---   |
| str              | str         | f64    | i64   |
|------------------|-------------|--------|-------|
| 7burgers         | Wakey Wakey | 4.2    | 15    |
| Bang Bang Burger | Forest Rd.  | 3.8    | 12    |
| Costa Coffee     | City Point  | 4.5    | 8     |
| The Queens Head  | Denman St.  | 4.7    | 25    |

Concatenation

Diagram showing three DataFrames arranged vertically combining into one DataFrame


Our central London data

import polars as pl

central = pl.read_csv("restaurants_central.csv")
shape: (4, 4)
| business         | location    | review | price |
| ---              | ---         | ---    | ---   |
| str              | str         | f64    | i64   |
|------------------|-------------|--------|-------|
| 7burgers         | Wakey Wakey | 4.2    | 15    |
| Bang Bang Burger | Forest Rd.  | 3.8    | 12    |
| Costa Coffee     | City Point  | 4.5    | 8     |
| The Queens Head  | Denman St.  | 4.7    | 25    |

Our South London data

south = pl.read_csv("restaurants_south.csv")
shape: (3, 4)
| business       | location       | review | price |
| ---            | ---            | ---    | ---   |
| str            | str            | f64    | i64   |
|----------------|----------------|--------|-------|
| Pizzeria Bella | Bermondsey St. | 3.5    | 18    |
| Costa Coffee   | Waterloo       | 4.1    | 8     |
| Nando's        | Brixton Rd.    | 4.0    | 20    |

  • Vertical concatenation - putting DataFrames on top of one another

Vertical concatenation

reviews = pl.concat([central, south], how="vertical")
shape: (7, 4)
| business         | location       | review | price |
| ---              | ---            | ---    | ---   |
| str              | str            | f64    | i64   |
|------------------|----------------|--------|-------|
| 7burgers         | Wakey Wakey    | 4.2    | 15    |
| Bang Bang Burger | Forest Rd.     | 3.8    | 12    |
| Costa Coffee     | City Point     | 4.5    | 8     |
| The Queens Head  | Denman St.     | 4.7    | 25    |
| Pizzeria Bella   | Bermondsey St. | 3.5    | 18    |
| Costa Coffee     | Waterloo       | 4.1    | 8     |
| Nando's          | Brixton Rd.    | 4.0    | 20    |

Vertical concatenation

reviews = pl.read_csv("restaurants*.csv")
shape: (7, 4)
| business         | location       | review | price |
| ---              | ---            | ---    | ---   |
| str              | str            | f64    | i64   |
|------------------|----------------|--------|-------|
| 7burgers         | Wakey Wakey    | 4.2    | 15    |
| Bang Bang Burger | Forest Rd.     | 3.8    | 12    |
| Costa Coffee     | City Point     | 4.5    | 8     |
| The Queens Head  | Denman St.     | 4.7    | 25    |
| Pizzeria Bella   | Bermondsey St. | 3.5    | 18    |
| Costa Coffee     | Waterloo       | 4.1    | 8     |
| Nando's          | Brixton Rd.    | 4.0    | 20    |

A third batch with missing data

north = pl.read_csv("north.csv")
shape: (2, 3)
| business      | location      | review |
| ---           | ---           | ---    |
| str           | str           | f64    |
|---------------|---------------|--------|
| Pig & Butcher | Liverpool Rd. | 3.2    |
| The Castle    | Angle         | 4.4    |

Vertical concat fails

pl.concat([central, south, north], how="vertical")
ShapeError: unable to append to a DataFrame of width 4 with a DataFrame of width 3

Diagonal concatenation

pl.concat([central, south, north], how="diagonal")
shape: (9, 4)
| business         | location            | review | price |
| ---              | ---                 | ---    | ---   |
| str              | str                 | f64    | i64   |
|------------------|---------------------|--------|-------|
| 7burgers         | Wakey Wakey         | 4.2    | 15    |
| ...              | ...                 | ...    | ...   |
| Nando's          | Brixton Rd.         | 4.0    | 20    |
| Pig & Butcher    | Liverpool Rd.       | 3.2    | null  |
| The Castle       | Angle               | 4.4    | null  |

Horizontal concatenation

cuisine = pl.read_csv("cuisine.csv")
shape: (7, 1)
| cuisine  |
| ---      |
| str      |
|----------|
| burgers  |
| burgers  |
| coffee   |
| ...      |
| chicken  |
  • Horizontal concatenation - setting DataFrames side-by-side

Horizontal concatenation

pl.concat([reviews, cuisine], how="horizontal")
shape: (7, 5)
| business         | location       | review | price | cuisine  |
| ---              | ---            | ---    | ---   | ---      |
| str              | str            | f64    | i64   | str      |
|------------------|----------------|--------|-------|----------|
| 7burgers         | Wakey Wakey    | 4.2    | 15    | burgers  |
| Bang Bang Burger | Forest Rd.     | 3.8    | 12    | burgers  |
| Costa Coffee     | City Point     | 4.5    | 8     | coffee   |
| The Queens Head  | Denman St.     | 4.7    | 25    | pub_food |
| Pizzeria Bella   | Bermondsey St. | 3.5    | 18    | pizza    |
| Costa Coffee     | Waterloo       | 4.1    | 8     | coffee   |
| Nando's          | Brixton Rd.    | 4.0    | 20    | chicken  |

Appending with extend

new_listing = pl.DataFrame({
    "business": ["Wagamama"],
    "location": ["Soho"],
    "review": [4.3],
    "price": [16]
})
shape: (1, 4)
| business         | location     | review | price |
| ---              | ---          | ---    | ---   |
| str              | str          | f64    | i64   |
|------------------|--------------|--------|-------|
| Wagamama         | Soho         | 4.3    | 16    |

Appending with extend

reviews.extend(new_listing)
shape: (8, 4)
| business         | location       | review | price |
| ---              | ---            | ---    | ---   |
| str              | str            | f64    | i64   |
|------------------|----------------|--------|-------|
| 7burgers         | Wakey Wakey    | 4.2    | 15    |
| ...              | ...            | ...    | ...   |
| Nando's          | Brixton Rd.    | 4.0    | 20    |
| Wagamama         | Soho           | 4.3    | 16    |

Let's practice!
