Rolling and cumulative aggregations

Data Transformation with Polars

Liam Brannigan

Data Scientist & Polars Contributor

Why rolling statistics?

Line plot of electricity prices

Line plot of electricity prices with rolling statistics

Rolling mean on a column

prices.with_columns(
    pl.col("price").rolling_mean(window_size=3).alias("smoothed")
)
| time                | price | solar | smoothed |
| ---                 | ---   | ---   | ---      |
| datetime[µs]        | f64   | f64   | f64      |
|---------------------|-------|-------|----------|
| 2025-07-05 00:00:00 | 34.2  | 0.0   | null     |
| 2025-07-05 01:00:00 | 28.7  | 0.0   | null     |
| 2025-07-05 02:00:00 | 27.5  | 0.0   | 30.1     |

Centering the window

prices.with_columns(
    pl.col("price").rolling_mean(window_size=3, center=True).alias("centered")
)
| time                | price | solar | centered |
| ---                 | ---   | ---   | ---      |
| datetime[µs]        | f64   | f64   | f64      |
|---------------------|-------|-------|----------|
| 2025-07-05 00:00:00 | 34.2  | 0.0   | null     |
| 2025-07-05 01:00:00 | 28.7  | 0.0   | 30.1     |
| 2025-07-05 02:00:00 | 27.5  | 0.0   | 27.1     |
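
To see what center=True changes, here is a plain-Python sketch of the window placement. It is not how Polars implements it: the helper name rolling_mean_sketch and the sample values are made up for illustration, and only odd window sizes are handled. A trailing window ends at the current row, while a centered window has the current row in the middle, so the nulls move from the start to both ends.

```python
def rolling_mean_sketch(values, window_size, center=False):
    """Illustrative rolling mean; returns None where the window is incomplete."""
    # How many rows before the current one the window starts.
    back = window_size // 2 if center else window_size - 1
    out = []
    for i in range(len(values)):
        start = i - back
        stop = start + window_size
        if start < 0 or stop > len(values):
            out.append(None)  # window runs off the edge of the data
        else:
            out.append(sum(values[start:stop]) / window_size)
    return out


values = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up sample values
trailing = rolling_mean_sketch(values, window_size=3)
centered = rolling_mean_sketch(values, window_size=3, center=True)
print(trailing)  # [None, None, 2.0, 3.0, 4.0]
print(centered)  # [None, 2.0, 3.0, 4.0, None]
```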

Other rolling expressions

  • .rolling_sum() - rolling total
  • .rolling_std() - rolling standard deviation
  • .rolling_max() / .rolling_min() - rolling maximum / minimum

Rolling on a DataFrame

prices.rolling(index_column="time", period="3h").agg(
    pl.all().mean()
)
| time                | price | solar  |
| ---                 | ---   | ---    |
| datetime[µs]        | f64   | f64    |
|---------------------|-------|--------|
| 2025-07-05 10:00:00 | 1.8   | 247.3  |
| 2025-07-05 11:00:00 | 1.4   | 347.7  |
| 2025-07-05 12:00:00 | 0.9   | 420.7  |

Why cumulative statistics?

Time series chart of solar power

  • Task: calculate cumulative values, such as the running maximum of solar power

Time series chart of solar power with cumulative max of solar power

Cumulative statistics

prices.with_columns(
    pl.col("solar").cum_max().alias("cumulative_solar")
)
| time                | price | solar | cumulative_solar |
| ---                 | ---   | ---   | ---              |
| datetime[µs]        | f64   | f64   | f64              |
|---------------------|-------|-------|------------------|
| ...                 | ...   | ...   | ...              |
| 2025-07-05 10:00:00 | 2.3   | 419.0 | 419.0            |
| 2025-07-05 11:00:00 | 0.5   | 481.0 | 481.0            |
| 2025-07-05 12:00:00 | 0.0   | 462.0 | 481.0            |
| ...                 | ...   | ...   | ...              |
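
What a cumulative maximum computes can be sketched in plain Python with itertools.accumulate. This only illustrates the idea and is not how Polars implements cum_max. The values are the 10:00 to 12:00 solar readings from the table above.

```python
from itertools import accumulate

# Solar values for 10:00, 11:00 and 12:00 from the table above.
solar = [419.0, 481.0, 462.0]

# At each step, keep the largest value seen so far.
running_max = list(accumulate(solar, max))
print(running_max)  # [419.0, 481.0, 481.0]
```

The 12:00 reading of 462.0 is lower than the earlier peak, so the cumulative maximum stays at 481.0.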

Cumulative sum for running totals

prices.with_columns(
    pl.col("solar").cum_sum().alias("total_energy")
)
| time                | price | solar | total_energy |
| ---                 | ---   | ---   | ---          |
| datetime[µs]        | f64   | f64   | f64          |
|---------------------|-------|-------|--------------|
| 2025-07-05 06:00:00 | 22.1  | 62.0  | 62.0         |
| 2025-07-05 07:00:00 | 4.5   | 100.0 | 162.0        |
| 2025-07-05 08:00:00 | 1.7   | 180.0 | 342.0        |

Let's practice!
