Subsetting a DataFrame

Introduction to Polars

Liam Brannigan

Data Scientist and Polars Contributor

Selecting rows

rentals[0]
shape: (1, 8)
| name      | type    | price | bedrooms | doubles | singles | review | beach |
| ---       | ---     | ---   | ---      | ---     | ---     | ---    | ---   |
| str       | str     | i64   | i64      | i64     | i64     | f64    | bool  |
|-----------|---------|-------|----------|---------|---------|--------|-------|
| Seashells | Cottage | 540   | 4        | 2       | 2       | 8.7    | true  |

Selecting rows

rentals[-1]
shape: (1, 8)
| name                | type    | price | bedrooms | doubles | singles | review | beach |
| ---                 | ---     | ---   | ---      | ---     | ---     | ---    | ---   |
| str                 | str     | i64   | i64      | i64     | i64     | f64    | bool  |
|---------------------|---------|-------|----------|---------|---------|--------|-------|
| Tehidy Holiday Park | Cottage | 637   | 4        | 2       | 4       | 9.0    | false |

Selecting a range of rows

rentals[1:3]
shape: (2, 8)
| name      | type    | price | bedrooms | doubles | singles | review | beach |
| ---       | ---     | ---   | ---      | ---     | ---     | ---    | ---   |
| str       | str     | i64   | i64      | i64     | i64     | f64    | bool  |
|-----------|---------|-------|----------|---------|---------|--------|-------|
| Seashells | Cottage | 540   | 4        | 2       | 2       | 8.7    | true  |
| Lake view | Cottage | 714   | 3        | 1       | 4       | 9.2    | true  |

Creating a Series from a DataFrame column

rentals["name"]
shape: (2,)
Series: 'name' [str]
[
    "Waves"
    "Seashells"
]
  • Single column in [] gives a Series
  • Series useful for visualizations

Selecting multiple columns with brackets

rentals[["name", "price"]]
shape: (5, 2)
| name        | price |
| ---         | ---   |
| str         | i64   |
|-------------|-------|
| Waves       | 540   |
| Seashells   | 540   |
| Lake view   | 714   |
| Piran View  | 775   |
| Palma Villa | 1772  |

Selecting rows and columns

rentals[:3, ["name", "price"]]
shape: (3, 2)
| name      | price |
| ---       | ---   |
| str       | i64   |
|-----------|-------|
| Waves     | 540   |
| Seashells | 540   |
| Lake view | 714   |

Subsetting columns with .select()

rentals.select("name", "price")
shape: (3, 2)
| name        | price |
| ---         | ---   |
| str         | i64   |
|-------------|-------|
| Waves       | 540   |
| Seashells   | 540   |
| Lake view   | 714   |
A list of column names also works:
rentals.select(["name", "price"])

Brackets or select?

rentals.select("name","price")
shape: (3, 2)
| name        | price |
| ---         | ---   |
| str         | i64   |
|-------------|-------|
| Waves       | 540   |
| Seashells   | 540   |
| Lake view   | 714   |

.select() allows query optimizations, for example in lazy mode

rentals[["name","price"]]
shape: (3, 2)
| name        | price |
| ---         | ---   |
| str         | i64   |
|-------------|-------|
| Waves       | 540   |
| Seashells   | 540   |
| Lake view   | 714   |

Let's practice!
