Performance estimation

Monitoring Machine Learning in Python

Hakim Elakhrass

Co-founder and CEO of NannyML

The algorithms

CBPE - confidence-based performance estimation
DLE - direct loss estimation

Direct loss estimation

Used for regression tasks
Estimates the loss function of the monitored model
LightGBM (LGBM) is used as the "extra" model
NannyML supports various regression metrics like MAE, MSE or RMSE

The image shows how the DLE algorithm works. First, production data arrives, and together with the monitored model's predictions it forms the analysis set. The analysis set is then passed to the extra model, which predicts the monitored model's loss, and the performance estimate is derived from that predicted loss.
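The idea behind DLE can be sketched in plain Python. This is a minimal illustration, not NannyML's implementation: a per-bin average of the absolute error stands in for the LightGBM extra model, only a single feature is used, and the names fit_loss_model, estimate_mae and bin_width are hypothetical. The data is made up for the example.

```python
from collections import defaultdict
from statistics import mean


def fit_loss_model(features, y_true, y_pred, bin_width=10.0):
    """Learn the average absolute error per feature bin on reference data.

    A binned average is a stand-in for the "extra" model (assumption).
    """
    errors_by_bin = defaultdict(list)
    for x, t, p in zip(features, y_true, y_pred):
        errors_by_bin[int(x // bin_width)].append(abs(t - p))
    overall = mean(e for errs in errors_by_bin.values() for e in errs)
    per_bin = {b: mean(errs) for b, errs in errors_by_bin.items()}
    # Bins never seen in reference fall back to the overall reference error
    return lambda x: per_bin.get(int(x // bin_width), overall)


def estimate_mae(loss_model, features):
    """Estimate MAE without targets: average the predicted absolute errors."""
    return mean(loss_model(x) for x in features)


# Reference period: targets are known, so the loss can be observed directly
ref_x = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
ref_true = [10.0, 10.0, 10.0, 20.0, 20.0, 20.0]
ref_pred = [11.0, 9.0, 10.0, 24.0, 17.0, 22.0]
loss_model = fit_loss_model(ref_x, ref_true, ref_pred)

# Analysis period: no targets are available, only the inputs
analysis_x = [12.0, 14.0, 15.0, 2.0]
estimated_mae = estimate_mae(loss_model, analysis_x)
print(f"Estimated MAE: {estimated_mae:.3f}")
```

In this toy data the analysis set contains more rows from the region where the monitored model makes larger errors, so the estimated MAE comes out higher than the MAE observed on the reference set.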

DLE - code implementation

import nannyml

# Initialize the DLE algorithm
# `features` is the list of feature column names, defined beforehand
estimator = nannyml.DLE(
    y_true='target',
    y_pred='y_pred',
    metrics=['rmse'],
    timestamp_column_name='timestamp',
    chunk_period='d',
    feature_column_names=features,
    tune_hyperparameters=False
)

# Fit the algorithm on the reference set
estimator.fit(reference)

# Estimate performance on the analysis set
results = estimator.estimate(analysis)

Confidence-based performance estimation

Used for binary and multiclass classification problems
Leverages confidence scores to estimate confusion matrix
Estimates any classification performance metric

The image shows how the CBPE algorithm works. First, the analysis set is passed to the CBPE algorithm, which uses the confidence scores to estimate the confusion matrix. Other classification metrics are then calculated from the estimated confusion matrix.
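The step from confidence scores to an estimated confusion matrix can be sketched as follows. This is a simplified illustration for the binary case, not NannyML's implementation: it assumes the scores are well-calibrated probabilities of the positive class, and the function names and example values are hypothetical.

```python
def estimate_confusion_matrix(y_pred_proba, y_pred):
    """Expected confusion matrix from positive-class probabilities.

    Assumes calibrated scores: a row scored p is truly positive
    with probability p.
    """
    cm = {"tp": 0.0, "fp": 0.0, "tn": 0.0, "fn": 0.0}
    for p, label in zip(y_pred_proba, y_pred):
        if label == 1:
            cm["tp"] += p        # predicted positive, correct with probability p
            cm["fp"] += 1.0 - p
        else:
            cm["tn"] += 1.0 - p  # predicted negative, correct with probability 1 - p
            cm["fn"] += p
    return cm


def estimate_accuracy(cm):
    """Accuracy computed from the expected confusion matrix."""
    total = cm["tp"] + cm["fp"] + cm["tn"] + cm["fn"]
    return (cm["tp"] + cm["tn"]) / total


# Analysis rows: scores and predicted labels only, no targets
y_pred_proba = [0.9, 0.8, 0.6, 0.3, 0.1]
y_pred = [1, 1, 1, 0, 0]

cm = estimate_confusion_matrix(y_pred_proba, y_pred)
estimated_accuracy = estimate_accuracy(cm)
print(f"Estimated accuracy: {estimated_accuracy:.2f}")
```

Other metrics such as precision or recall can be derived from the same expected counts, which is why the confusion matrix is estimated first.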

CBPE - code implementation

import nannyml

# Initialize the CBPE algorithm
estimator = nannyml.CBPE(
    y_pred_proba='y_pred_proba',
    y_pred='y_pred',
    y_true='targets',
    timestamp_column_name='timestamp',
    metrics=['roc_auc'],
    chunk_period='d',
    problem_type='classification_binary',
)

# Fit the algorithm on the reference set
estimator.fit(reference)

# Estimate performance on the analysis set
results = estimator.estimate(analysis)

Results

results.plot().show()

The resulting plot shows the estimated RMSE over time (the metric requested in the DLE example).

Let's practice!
