The bias-variance tradeoff

Model Validation in Python

Kasey Jones

Data Scientist

Variance

  • Variance: following the training data too closely
    • Fails to generalize to the test data
    • Low training error but high testing error
    • Occurs when models are overfit and have high complexity
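The symptoms listed above can be reproduced on a small example. This is a minimal sketch, not part of the original slides: the synthetic dataset from `make_classification`, the 20% label noise (`flip_y`), and the choice of an unconstrained `DecisionTreeClassifier` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed synthetic data: flip_y randomly reassigns 20% of the labels,
# so a model that memorizes the training set cannot generalize perfectly.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# No max_depth: the tree keeps splitting until every training point is fit.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)

train_acc = accuracy_score(y_train, tree.predict(X_train))
test_acc = accuracy_score(y_test, tree.predict(X_test))

print("Training: {0:.2f}".format(train_acc))
print("Testing: {0:.2f}".format(test_acc))
print("Gap: {0:.2f}".format(train_acc - test_acc))
```

A large gap between the two scores is the usual sign of high variance.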

Overfitting models (high variance)

Overfitting occurs when our predictions follow the training data too closely. If we drew a scatter plot of the training data and every prediction fell exactly in line with the real values, the model would probably be overfit.
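The scatter-plot check can also be done numerically by looking at residuals (real value minus prediction). The sketch below is an illustration under assumptions: the noisy regression data from `make_regression` and the unconstrained `DecisionTreeRegressor` are not from the slides.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumed synthetic regression data with added noise.
X, y = make_regression(n_samples=300, n_features=5, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# An unconstrained tree grows until each leaf holds a single training point.
tree = DecisionTreeRegressor(random_state=0)
tree.fit(X_train, y_train)

# Residuals: how far each prediction is from the real value.
train_residuals = y_train - tree.predict(X_train)
test_residuals = y_test - tree.predict(X_test)

print("Largest training residual: {0:.2f}".format(np.abs(train_residuals).max()))
print("Largest testing residual: {0:.2f}".format(np.abs(test_residuals).max()))
```

Training residuals of zero mean every training prediction sits exactly on the real value, which is the pattern described above; the test residuals show whether that fit carries over to new data.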

Bias

  • Bias: failing to find the relationship between the data and the response
    • High error on both the training and testing data
    • Occurs when models are underfit and too simple to capture the pattern

Underfitting models (high bias)

Underfitting occurs when there is a real relationship between the response we are predicting and the predictor variables in the model, but the model fails to find it.
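One way to see this is to fit a model that is too simple for the data and compare it with a more flexible one. The sketch below is an assumption-laden illustration, not from the slides: it uses `make_circles` (two concentric rings), a `LogisticRegression` as the too-simple model, and a depth-limited `DecisionTreeClassifier` as the comparison.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed synthetic data: the classes form an inner and an outer ring,
# so the relationship is real but cannot be captured by a straight line.
X, y = make_circles(n_samples=500, noise=0.1, factor=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# A linear decision boundary is too simple for ring-shaped classes.
linear = LogisticRegression()
linear.fit(X_train, y_train)
linear_train_acc = accuracy_score(y_train, linear.predict(X_train))
linear_test_acc = accuracy_score(y_test, linear.predict(X_test))

# A modestly deep tree can box in the inner ring.
tree = DecisionTreeClassifier(max_depth=5, random_state=0)
tree.fit(X_train, y_train)
tree_test_acc = accuracy_score(y_test, tree.predict(X_test))

print("Linear training: {0:.2f}".format(linear_train_acc))
print("Linear testing: {0:.2f}".format(linear_test_acc))
print("Tree testing: {0:.2f}".format(tree_test_acc))
```

The linear model scores poorly on both sets, while the tree's test score shows the relationship was there to be found.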

Optimal performance

Parameters causing over/under fitting

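from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# max_depth=4: shallow trees; training and testing accuracy are both relatively low
rfc = RandomForestClassifier(n_estimators=100, max_depth=4)
rfc.fit(X_train, y_train)
train_predictions = rfc.predict(X_train)
test_predictions = rfc.predict(X_test)

print("Training: {0:.2f}".format(accuracy_score(y_train, train_predictions)))
Training: 0.84
print("Testing: {0:.2f}".format(accuracy_score(y_test, test_predictions)))
Testing: 0.77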
# max_depth=14: deep trees; training accuracy is perfect but testing accuracy lags behind
rfc = RandomForestClassifier(n_estimators=100, max_depth=14)
rfc.fit(X_train, y_train)
train_predictions = rfc.predict(X_train)
test_predictions = rfc.predict(X_test)

print("Training: {0:.2f}".format(accuracy_score(y_train, train_predictions)))
Training: 1.00
print("Testing: {0:.2f}".format(accuracy_score(y_test, test_predictions)))
Testing: 0.83
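# max_depth=10: training and testing accuracy are close, and testing accuracy is the highest of the three
rfc = RandomForestClassifier(n_estimators=100, max_depth=10)
rfc.fit(X_train, y_train)
train_predictions = rfc.predict(X_train)
test_predictions = rfc.predict(X_test)

print("Training: {0:.2f}".format(accuracy_score(y_train, train_predictions)))
Training: 0.89
print("Testing: {0:.2f}".format(accuracy_score(y_test, test_predictions)))
Testing: 0.86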
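The three runs above can be folded into a single loop over `max_depth`. This is a sketch under assumptions: the slides do not show where `X_train` and `X_test` come from, so the data here is a synthetic stand-in from `make_classification`, and the scores it prints will not match the numbers on the slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumed synthetic stand-in for the course data.
X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                           flip_y=0.1, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1)

results = {}  # max_depth -> (training accuracy, testing accuracy)
for depth in [2, 4, 10, 14]:
    rfc = RandomForestClassifier(n_estimators=100, max_depth=depth,
                                 random_state=1)
    rfc.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, rfc.predict(X_train))
    test_acc = accuracy_score(y_test, rfc.predict(X_test))
    results[depth] = (train_acc, test_acc)
    print("max_depth={0:2d}  Training: {1:.2f}  Testing: {2:.2f}  Gap: {3:.2f}"
          .format(depth, train_acc, test_acc, train_acc - test_acc))
```

Reading down the output, a depth where both scores are low points toward underfitting, a depth where the gap is large points toward overfitting, and a reasonable choice is the depth with the best testing score and a small gap.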

Remember, only you can prevent overfitting!
