Backtesting

Designing Forecasting Pipelines for Production

Rami Krispin

Senior Manager, Data Science and Engineering

Backtesting

Backtesting - splitting the data into several partitions, or windows, for training and testing models

Backtesting

The training splits

Backtesting

The test splits

Backtesting

The expanding window approach increases the training sample size with each successive partition

Backtesting

The sliding window approach keeps each partition's sample size the same

Backtesting

In both approaches, the size of the test sets remains unchanged
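The difference between the two approaches can be sketched with plain index arithmetic. This is a minimal illustration, not the course's implementation: the function name `backtest_splits`, its `window` argument, and the toy sizes are assumptions made for this example.

```python
def backtest_splits(n_obs, n_partitions, h, step_size, window=None):
    """Return (train_start, train_end, test_end) index triples, oldest first.

    Training covers [train_start, train_end); testing covers [train_end, test_end).
    window=None gives an expanding window; an integer gives a sliding window
    with that many training observations (hypothetical helper for illustration).
    """
    splits = []
    for i in range(n_partitions):
        # The last partition's test set ends at the final observation.
        test_end = n_obs - (n_partitions - 1 - i) * step_size
        train_end = test_end - h
        train_start = 0 if window is None else train_end - window
        if train_start < 0 or train_end <= train_start:
            raise ValueError("not enough observations for the requested splits")
        splits.append((train_start, train_end, test_end))
    return splits


expanding = backtest_splits(n_obs=100, n_partitions=3, h=10, step_size=10)
sliding = backtest_splits(n_obs=100, n_partitions=3, h=10, step_size=10, window=50)

for name, splits in [("expanding", expanding), ("sliding", sliding)]:
    for train_start, train_end, test_end in splits:
        print(name, "train size:", train_end - train_start,
              "test size:", test_end - train_end)
```

In the expanding case the training size grows from 70 to 90 observations, while in the sliding case it stays at 50; the test size is 10 in every partition.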

Backtesting

Box plot of RMSE scores for five models - Lasso, LightGBM, Linear Regression, Ridge, and XGBoost

Backtesting settings

Models

XGBoost
Light Gradient-Boosting Machine
Ridge
Lasso
Linear Regression

Backtesting settings

Partitions Settings

Backtesting parameters

Backtesting settings

Model uncertainty

Conformal prediction intervals
- Can handle both parametric and nonparametric models
- 95% confidence level
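The idea behind conformal intervals can be sketched in a few lines. The example below shows basic split conformal prediction with absolute residuals, which is not necessarily what the "conformal_distribution" method does internally; the function name `conformal_interval` and the toy numbers are assumptions made for this illustration.

```python
import math


def conformal_interval(calib_actuals, calib_forecasts, new_forecast, level=95):
    """Symmetric interval around new_forecast from calibration residuals.

    Hypothetical helper: uses absolute errors on held-out calibration data
    as conformity scores, so no distributional assumption about the model
    is needed.
    """
    scores = sorted(abs(a - f) for a, f in zip(calib_actuals, calib_forecasts))
    n = len(scores)
    if n == 0:
        raise ValueError("calibration data is empty")
    # Finite-sample rank: ceil((n + 1) * level / 100).
    rank = math.ceil((n + 1) * level / 100)
    # Simplification: with too few calibration points, use the largest score.
    rank = min(rank, n)
    q = scores[rank - 1]
    return new_forecast - q, new_forecast + q


# Toy calibration set: forecasts of 100 with absolute errors 1, 2, ..., 19.
calib_forecasts = [100] * 19
calib_actuals = [100 + k for k in range(1, 20)]

interval_95 = conformal_interval(calib_actuals, calib_forecasts, 100, level=95)
interval_80 = conformal_interval(calib_actuals, calib_forecasts, 100, level=80)
print(interval_95, interval_80)
```

Because the interval width comes only from observed forecast errors, the same procedure applies to linear models and tree-based models alike.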

Required libraries

from mlforecast import MLForecast
from mlforecast.target_transforms import Differences
from mlforecast.utils import PredictionIntervals
from window_ops.expanding import expanding_mean
from lightgbm import LGBMRegressor
from xgboost import XGBRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestRegressor
from utilsforecast.plotting import plot_series

Required libraries

import pandas as pd
import numpy as np
import requests
import json
import os
import datetime
from statistics import mean

Data preparation

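# Load the series and keep the columns MLForecast expects
ts = pd.read_csv("data/data.csv")
ts["ds"] = pd.to_datetime(ts["ds"])
ts = ts.sort_values("ds")
ts = ts[["unique_id", "ds", "y"]]

# Keep the most recent 25 blocks of 31 days (about 25 months) of hourly data
end = ts["ds"].max()
start = end - datetime.timedelta(hours = 24 * 31 * 25)
ts = ts[ts["ds"] >= start]

# Return unique_id as a column rather than as the index
os.environ['NIXTLA_ID_AS_COL'] = '1'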

Define the forecasting models

# Candidate regression models, keyed by the column name used in the output
ml_models = {
    "lightGBM": LGBMRegressor(n_estimators=500, verbosity=-1),
    "xgboost": XGBRegressor(),
    "linear_regression": LinearRegression(),
    "lasso": Lasso(),
    "ridge": Ridge()
}

# Hourly series; lags 1 through 23 plus calendar features as regressors
mlf = MLForecast(
    models=ml_models,
    freq='h',
    lags=list(range(1, 24)),
    date_features=["month", "day", "dayofweek", "week", "hour"])

Set the backtesting parameters

Window Settings

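partitions = 10  # number of backtesting partitions
step_size = 24   # hours between consecutive partitions
h = 72           # forecast horizon (test set length) in hours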
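To see how these three settings place the partitions in time, the cutoffs (the last training timestamp of each partition) can be computed by hand. This is a sketch under the assumption that the final partition's test set ends at the last observation; the helper `backtest_cutoffs` and the end timestamp are hypothetical and not taken from the data.

```python
import datetime


def backtest_cutoffs(last_timestamp, partitions, step_size, h):
    """Return the training cutoff of each partition, oldest first.

    Hypothetical helper: assumes hourly data and that the newest
    partition's test set ends at last_timestamp.
    """
    cutoffs = []
    for i in range(partitions):
        # Older partitions are shifted back by step_size hours each.
        offset_hours = h + (partitions - 1 - i) * step_size
        cutoffs.append(last_timestamp - datetime.timedelta(hours=offset_hours))
    return cutoffs


# Hypothetical final timestamp of the series, for illustration only.
last_timestamp = datetime.datetime(2024, 1, 31, 23)
cutoffs = backtest_cutoffs(last_timestamp, partitions=10, step_size=24, h=72)

for cutoff in cutoffs:
    print(cutoff)
```

With a 24-hour step and a 72-hour horizon, consecutive test sets overlap by 48 hours, so most timestamps near the end of the series are forecast in up to three partitions.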

Prediction Intervals Settings

n_windows = 5  # number of windows used to calibrate the intervals
method = "conformal_distribution"
pi = PredictionIntervals(h=h, n_windows=n_windows, method=method)
levels = [95]  # 95% prediction intervals

Training models with backtesting

bkt_df = mlf.cross_validation(
        df = ts,
        h = h,
        step_size = step_size,
        n_windows = partitions,
        prediction_intervals = pi, 
        level = levels)
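Once the backtesting output is available, a per-partition score such as RMSE can be derived by grouping rows on the cutoff column. The sketch below uses plain Python records rather than the actual DataFrame; the helper `rmse_by_cutoff` and the toy rows are assumptions made for this example, with column names mirroring the backtesting output.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def rmse_by_cutoff(rows, model):
    """Return {cutoff: RMSE} for one model column (hypothetical helper)."""
    squared_errors = defaultdict(list)
    for row in rows:
        squared_errors[row["cutoff"]].append((row["y"] - row[model]) ** 2)
    return {cutoff: sqrt(mean(errs)) for cutoff, errs in squared_errors.items()}


# Toy records with the same column names as the backtesting output.
rows = [
    {"cutoff": "p1", "y": 10.0, "lightGBM": 13.0},
    {"cutoff": "p1", "y": 20.0, "lightGBM": 16.0},
    {"cutoff": "p2", "y": 30.0, "lightGBM": 30.0},
    {"cutoff": "p2", "y": 40.0, "lightGBM": 40.0},
]

scores = rmse_by_cutoff(rows, "lightGBM")
print(scores)
```

With the real output, the records could be obtained with `bkt_df.to_dict("records")`, and repeating the call for each model name gives one RMSE per model and partition, which is the kind of data a box plot of scores summarizes.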

Training models with backtesting

print(bkt_df.head())

     unique_id    ds                     cutoff                 y            lightGBM         
0    1            2024-04-22 00:00:00    2024-04-21 23:00:00    421082.60    421089.155837    
1    1            2024-04-22 01:00:00    2024-04-21 23:00:00    429728.30    425700.453391    
2    1            2024-04-22 02:00:00    2024-04-21 23:00:00    430690.96    424382.613668    
3    1            2024-04-22 03:00:00    2024-04-21 23:00:00    420094.58    409967.877157    
4    1            2024-04-22 04:00:00    2024-04-21 23:00:00    403292.36    393175.446116

Backtesting results in wide format

Information about the bkt_df DataFrame, containing 19 columns including unique ID, ds, y, and model values

Let's practice!
