Creating conditional expressions

Data Transformation with Polars

Liam Brannigan

Data Scientist & Polars Contributor

Categorizing by rating quality

shape: (5, 5)
| business         | location        | type       | rating | capacity |
| ---              | ---             | ---        | ---    | ---      |
| str              | str             | str        | i64    | i64      |
|------------------|-----------------|------------|--------|----------|
| 7burgers         | Wakey Wakey     | restaurant | 5      | 55       |
| Bang Bang Burger | Forest Rd.      | restaurant | 4      | 55       |
| Costa Coffee     | City Point      | café       | 5      | 41       |
| Costa Coffee     | The Moorgate    | takeaway   | 3      | 0        |
| The Queens Head  | Denman St.      | bar        | 5      | 187      |
  • Task: label each business as highly rated or not

Categorizing by rating quality

ratings.with_columns(


)

Categorizing by rating quality

ratings.with_columns(
    pl.when(                     )

)

Categorizing by rating quality

ratings.with_columns(
    pl.when(pl.col("rating") == 5)

)

Categorizing by rating quality

ratings.with_columns(
    pl.when(pl.col("rating") == 5).then(pl.lit("Highly Rated"))

)

  • Use pl.lit() to pass a literal value, not a column name

Categorizing by rating quality

ratings.with_columns(
    pl.when(pl.col("rating") == 5).then(pl.lit("Highly Rated"))
    .otherwise(pl.lit("Needs Improvement"))
)

Categorizing by rating quality

ratings.with_columns(
    pl.when(pl.col("rating") == 5).then(pl.lit("Highly Rated"))
    .otherwise(pl.lit("Needs Improvement")).alias("quality")
)
shape: (5, 6)
| business         | location        | type       | rating | capacity | quality            |
| ---              | ---             | ---        | ---    | ---      | ---                |
| str              | str             | str        | i64    | i64      | str                |
|------------------|-----------------|------------|--------|----------|--------------------|
| 7burgers         | Wakey Wakey     | restaurant | 5      | 55       | Highly Rated       |
| Bang Bang Burger | Forest Rd.      | restaurant | 4      | 55       | Needs Improvement  |
| ...              | ...             | ...        | ...    | ...      | ...                |

Categorizing by venue size

shape: (5, 5)
| business         | location        | type       | rating | capacity |
| ---              | ---             | ---        | ---    | ---      |
| str              | str             | str        | i64    | i64      |
|------------------|-----------------|------------|--------|----------|
| 7burgers         | Wakey Wakey     | restaurant | 5      | 55       |
| Bang Bang Burger | Forest Rd.      | restaurant | 4      | 55       |
| Costa Coffee     | City Point      | café       | 5      | 41       |
| Costa Coffee     | The Moorgate    | takeaway   | 3      | 0        |
| The Queens Head  | Denman St.      | bar        | 5      | 187      |
  • Task: categorize venues by capacity

Categorizing by venue size

ratings.with_columns(



)

Categorizing by venue size

ratings.with_columns(
    pl.when(pl.col("capacity") > 100)


)

Categorizing by venue size

ratings.with_columns(
    pl.when(pl.col("capacity") > 100).then(pl.lit("Large"))


)

Categorizing by venue size

ratings.with_columns(
    pl.when(pl.col("capacity") > 100).then(pl.lit("Large"))
    .when(pl.col("capacity") >= 20)

)

Categorizing by venue size

ratings.with_columns(
    pl.when(pl.col("capacity") > 100).then(pl.lit("Large"))
    .when(pl.col("capacity") >= 20).then(pl.lit("Medium"))

)

Categorizing by venue size

ratings.with_columns(
    pl.when(pl.col("capacity") > 100).then(pl.lit("Large"))
    .when(pl.col("capacity") >= 20).then(pl.lit("Medium"))
    .otherwise(pl.lit("Small"))
)

Categorizing by venue size

ratings.with_columns(
    pl.when(pl.col("capacity") > 100).then(pl.lit("Large"))
    .when(pl.col("capacity") >= 20).then(pl.lit("Medium"))
    .otherwise(pl.lit("Small")).alias("venue_size")
)
shape: (5, 6)
| business         | location        | type       | rating | capacity | venue_size |
| ---              | ---             | ---        | ---    | ---      | ---        |
| str              | str             | str        | i64    | i64      | str        |
|------------------|-----------------|------------|--------|----------|------------|
| 7burgers         | Wakey Wakey     | restaurant | 5      | 55       | Medium     |
| Bang Bang Burger | Forest Rd.      | restaurant | 4      | 55       | Medium     |
| Costa Coffee     | City Point      | café       | 5      | 41       | Medium     |
| Costa Coffee     | The Moorgate    | takeaway   | 3      | 0        | Small      |
| The Queens Head  | Denman St.      | bar        | 5      | 187      | Large      |

Standardizing business types

shape: (5, 5)
| business         | location        | type       | rating | capacity |
| ---              | ---             | ---        | ---    | ---      |
| str              | str             | str        | i64    | i64      |
|------------------|-----------------|------------|--------|----------|
| 7burgers         | Wakey Wakey     | restaurant | 5      | 55       |
| Bang Bang Burger | Forest Rd.      | restaurant | 4      | 55       |
| Costa Coffee     | City Point      | café       | 5      | 41       |
| Costa Coffee     | The Moorgate    | takeaway   | 3      | 0        |
| The Queens Head  | Denman St.      | bar        | 5      | 187      |
  • Task: standardize the type column

Standardizing business types

ratings.with_columns(
    pl.col("type")
)

Standardizing business types

ratings.with_columns(
    pl.col("type").replace(                    )
)

Standardizing business types

ratings.with_columns(
    pl.col("type").replace("café", "restaurant")
)
shape: (5, 5)
| business         | location        | type       | rating | capacity |
| ---              | ---             | ---        | ---    | ---      |
| str              | str             | str        | i64    | i64      |
|------------------|-----------------|------------|--------|----------|
| 7burgers         | Wakey Wakey     | restaurant | 5      | 55       |
| Bang Bang Burger | Forest Rd.      | restaurant | 4      | 55       |
| Costa Coffee     | City Point      | restaurant | 5      | 41       |
| Costa Coffee     | The Moorgate    | takeaway   | 3      | 0        |
| The Queens Head  | Denman St.      | bar        | 5      | 187      |

Let's practice!
