Filtering rows

Introduction to Polars

Liam Brannigan

Data Scientist & Polars Contributor

Why filter a DataFrame?

rentals
shape: (49, 8)
| name      | type    | price | bedrooms | doubles | singles | review | beach |
| ---       | ---     | ---   | ---      | ---     | ---     | ---    | ---   |
| str       | str     | i64   | i64      | i64     | i64     | f64    | bool  |
|-----------|---------|-------|----------|---------|---------|--------|-------|
| Waves     | Cottage | 540   | 4        | 1       | 2       | 8.9    | false |
| Seashells | Cottage | 540   | 4        | 2       | 2       | 8.7    | true  |
| ...       | ...     | ...   | ...      | ...     | ...     | ...    | ...   |

Introducing filter

rentals.filter(
  # Predicate
)
  • Predicate: a condition that evaluates to True or False for each row


Adding a predicate

rentals.filter(
    pl.col("price") < 500
)
shape: (6, 8)
| name          | type    | price | bedrooms | doubles | singles | review | beach |
| ---           | ---     | ---   | ---      | ---     | ---     | ---    | ---   |
| str           | str     | i64   | i64      | i64     | i64     | f64    | bool  |
|---------------|---------|-------|----------|---------|---------|--------|-------|
| St Ives Bay   | Hotel   | 470   | 3        | 1       | 3       | 7.7    | true  |
| Perran View   | Hotel   | 341   | 3        | 1       | 4       | 8.2    | false |
| ...           | ...     | ...   | ...      | ...     | ...     | ...    | ...   |

Combining conditions with AND

rentals.filter(
    # Properties under 500 by the beach
    (pl.col("price") < 500) & (pl.col("beach") == True)
)
shape: (6, 8)
| name                | type    | price | ... | doubles | singles | review | beach |
| ---                 | ---     | ---   | ... | ---     | ---     | ---    | ---   |
| str                 | str     | i64   | ... | i64     | i64     | f64    | bool  |
|---------------------|---------|-------|-----|---------|---------|--------|-------|
| St Ives Bay         | Hotel   | 470   | ... | 1       | 3       | 7.7    | true  |
| Porth Retreat       | Hotel   | 249   | ... | 1       | 4       | null   | true  |
| Porth Caravan       | Caravan | 380   | ... | 1       | 4       | 7.2    | true  |
| ...                 | ...     | ...   | ... | ...     | ...     | ...    | ...   |

Combining conditions with OR

rentals.filter(
    (pl.col("price") < 500) | (pl.col("review") > 9.5)
)
shape: (18, 8)
| name              | type      | price | ... | doubles | singles | review | beach |
| ---               | ---       | ---   | ... | ---     | ---     | ---    | ---   |
| str               | str       | i64   | ... | i64     | i64     | f64    | bool  |
|-------------------|-----------|-------|-----|---------|---------|--------|-------|
| Bright House      | Cottage   | 956   | ... | 3       | null    | 9.9    | true  |
| Trewhiddle Villa  | Villa     | 1077  | ... | 3       | null    | 9.8    | false |
| ...               | ...       | ...   | ... | ...     | ...     | ...    | ...   |

Filtering based on a list

rentals.filter(
    pl.col("type").is_in(["Cottage", "Villa"])
)
shape: (34, 8)
| name          | type    | price | bedrooms | doubles | singles | review | beach |
| ---           | ---     | ---   | ---      | ---     | ---     | ---    | ---   |
| str           | str     | i64   | i64      | i64     | i64     | f64    | bool  |
|---------------|---------|-------|----------|---------|---------|--------|-------|
| Bright House  | Cottage | 956   | 4        | 3       | null    | 9.9    | true  |
| Bright House  | Cottage | 1050  | 4        | 3       | null    | 9.9    | true  |
| ...           | ...     | ...   | ...      | ...     | ...     | ...    | ...   |

Negating a predicate

rentals.filter(
    pl.col("type").is_in(["Cottage", "Villa"]).not_()
)
shape: (14, 8)
| name            | type      | price | ... | doubles | singles | review | beach |
| ---             | ---       | ---   | ... | ---     | ---     | ---    | ---   |
| str             | str       | i64   | ... | i64     | i64     | f64    | bool  |
|-----------------|-----------|-------|-----|---------|---------|--------|-------|
| Tregenna House  | Hotel     | 2411  | ... | 1       | 2       | 8.7    | true  |
| Tregenna House  | Hotel     | 2411  | ... | 1       | 2       | 8.7    | true  |
| ...             | ...       | ...   | ... | ...     | ...     | ...    | ...   |

Query optimizations

(
  pl.scan_csv("vacation_rentals.csv")
  .filter(pl.col("type") == "Villa")
  .explain()
)
Csv SCAN [vacation_rentals.csv]
PROJECT */8 COLUMNS
SELECTION: [(col("type")) == (String(Villa))]

Filter conditions

  • pl.col("price") < 500
  • pl.col("type").is_in(["Cottage", "Villa"])
  • pl.col("bedrooms").is_between(3, 4)
1 https://docs.pola.rs/api/python/stable/reference/expressions/boolean.html

Using standard Python comparison operators

| Operator | Meaning               |
|----------|-----------------------|
| ==       | Equal to              |
| !=       | Not equal to          |
| <        | Less than             |
| <=       | Less than or equal    |
| >        | Greater than          |
| >=       | Greater than or equal |

Let's practice!
