Validating your predictions

Building Recommendation Engines in Python

Rob O'Callaghan

Director of Data

Hold-out sets

Large matrix

Hold-out sets

Large matrix with last column highlighted

Hold-out sets

Large matrix with last column highlighted next to unhighlighted matrix

Hold-out sets

Large matrix with last column highlighted next to matrix with irregular highlighting

Hold-out sets

Large matrix with bottom rows highlighted as holdout set next to unhighlighted matrix

Hold-out sets

Large matrix with bottom rows highlighted as holdout set next to matrix with bottom left corner highlighted

Separating the hold-out set

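import numpy as np

# Copy the hold-out block first, so blanking it below cannot alter the saved values
actual_values = act_ratings_df.iloc[:20, :100].values.copy()

# Hide the hold-out block from the model
act_ratings_df.iloc[:20, :100] = np.nan

Generate predictions as before.

# Extract the predictions for the same hold-out block
predicted_values = calc_pred_ratings_df.iloc[:20, :100].values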
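
The steps above can be tried end to end on a small matrix. The following is a minimal sketch, not the course's method: the 6 x 5 toy ratings, the 2 x 3 hold-out block, and the column-mean fill used as a stand-in for "generate predictions as before" are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy ratings: 6 users x 5 items, NaN means "not rated".
act_ratings_df = pd.DataFrame([
    [4.0, np.nan, 5.0, 3.0, 1.0],
    [np.nan, 4.0, 3.0, np.nan, 2.0],
    [5.0, 3.0, np.nan, 4.0, np.nan],
    [3.0, np.nan, 4.0, 2.0, 5.0],
    [np.nan, 5.0, 2.0, np.nan, 4.0],
    [4.0, 2.0, np.nan, 5.0, 3.0],
])

# Save a copy of the hold-out block (first 2 users, first 3 items).
actual_values = act_ratings_df.iloc[:2, :3].to_numpy(copy=True)

# Blank the block so the predictor never sees these ratings.
act_ratings_df.iloc[:2, :3] = np.nan

# Stand-in predictor (assumption): fill every gap with the item's mean rating.
calc_pred_ratings_df = act_ratings_df.fillna(act_ratings_df.mean())

# Pull out the predictions for the same block.
predicted_values = calc_pred_ratings_df.iloc[:2, :3].to_numpy()
```

Because the block was copied before it was blanked, actual_values still holds the original ratings and can be compared with predicted_values.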

Masking the hold-out set

mask = ~np.isnan(actual_values)
print(actual_values[mask])
[4.  4.  5.  3.  3.  ...]
print(predicted_values[mask])
[3.76  4.35  4.95  3.5869079  3.686337  ...]
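
The mask matters because most entries in the hold-out block were never rated, so there is nothing to compare the predictions against. A small sketch with made-up arrays (the values are assumptions, not course data) shows how the boolean mask keeps only the positions where a true rating exists:

```python
import numpy as np

# Hypothetical hold-out block: NaN marks ratings the user never gave.
actual_values = np.array([[4.0, np.nan, 5.0],
                          [np.nan, 4.0, 3.0]])

# Hypothetical predictions: the model fills every cell.
predicted_values = np.array([[3.8, 2.9, 4.6],
                             [3.1, 4.4, 3.5]])

# True where a real rating exists, False where it is missing.
mask = ~np.isnan(actual_values)

# Boolean indexing flattens both arrays to the same known positions.
actual_known = actual_values[mask]
predicted_known = predicted_values[mask]

print(actual_known)
print(predicted_known)
```

Both results are one-dimensional and aligned element by element, which is the shape an error metric expects.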

Introducing RMSE (root mean squared error)

Table of actual versus predicted values

Introducing RMSE (root mean squared error)

Table of actual versus predicted values and their difference

Introducing RMSE (root mean squared error)

Table of actual versus predicted values, their difference, and their difference squared

Introducing RMSE (root mean squared error)

Table of actual versus predicted values, their difference, their difference squared and the RMSE equation
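
The columns of the table correspond to the steps of the calculation: RMSE = sqrt( (1/N) * sum over i of (actual_i - predicted_i)^2 ). As a rough illustration, the sketch below builds each column with NumPy; the four rating pairs are invented for the example.

```python
import numpy as np

# Hypothetical actual and predicted ratings for four user-item pairs.
actual = np.array([4.0, 5.0, 4.0, 3.0])
predicted = np.array([3.5, 4.5, 4.5, 3.5])

# Step 1: the difference between each actual and predicted value.
difference = actual - predicted

# Step 2: square the differences so positive and negative errors do not cancel.
squared_difference = difference ** 2

# Step 3: average the squared differences (the mean squared error).
mse = squared_difference.mean()

# Step 4: take the square root to return to the original rating scale.
rmse = np.sqrt(mse)

print(rmse)
```

Every prediction here is off by exactly half a point, so the RMSE is also half a point, which shows that the metric is expressed in the same units as the ratings.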

RMSE in Python

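import numpy as np
from sklearn.metrics import mean_squared_error

# squared=False is deprecated in recent scikit-learn releases;
# taking the square root of the MSE gives the same RMSE
print(np.sqrt(mean_squared_error(actual_values[mask],
                                 predicted_values[mask])))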
3.6223997

Let's practice!
