Hyperparameter tuning in deep learning

Predicting CTR with Machine Learning in Python

Kevin Huo

Instructor

Learning rate and number of iterations

Examples of learning rates

  • Weights are updated iteratively
    • Uses back-propagation
  • A good learning rate makes the loss drop quickly and then stabilize
    • Shown by the red line
  • Too high a learning rate causes updates to "overshoot", leaving the loss very high
    • Shown by the yellow line
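
The effect of the learning rate can be illustrated with a toy example. The sketch below is not a neural network: it assumes a single weight and the loss L(w) = w**2, and the function name and the two learning rates are chosen only for illustration. It shows the same qualitative behavior as the bullets above: a moderate rate shrinks the loss, while a rate that is too high overshoots the minimum on every step and the loss grows.

```python
def gradient_descent(learning_rate, n_iter=20, w0=5.0):
    """Minimize the toy loss L(w) = w**2; return the loss after each update."""
    w = w0
    losses = []
    for _ in range(n_iter):
        grad = 2 * w                   # dL/dw for L(w) = w**2
        w = w - learning_rate * grad   # iterative weight update
        losses.append(w ** 2)
    return losses

good = gradient_descent(learning_rate=0.1)      # loss drops and stabilizes
too_high = gradient_descent(learning_rate=1.1)  # each step overshoots the minimum

print(f"lr=0.1: first loss {good[0]:.3f}, last loss {good[-1]:.6f}")
print(f"lr=1.1: first loss {too_high[0]:.3f}, last loss {too_high[-1]:.1f}")
```

In a real network the gradient comes from back-propagation over many weights, but the update rule has the same form.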

Choosing hidden layers

Effect of hidden layer size

  • Performance increases up to a certain level of complexity, then drops off.
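
One way to look for this effect is to fit the same model with several hidden layer sizes and compare training and held-out accuracy. The sketch below is a minimal example under stated assumptions: it uses synthetic data from make_classification as a stand-in for the CTR data, and the layer sizes, max_iter and random_state values are arbitrary. On such a small synthetic problem the drop-off may not be visible, so treat the output as a template for the comparison rather than as a result.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the CTR data (an assumption, not the course dataset)
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

scores = {}
with warnings.catch_warnings():
    # Small max_iter values can stop before convergence; silence that warning
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    for size in [(2,), (8,), (32,), (128,)]:
        clf = MLPClassifier(hidden_layer_sizes=size, max_iter=200, random_state=0)
        clf.fit(X_train, y_train)
        # Store (training accuracy, held-out accuracy) for this layer size
        scores[size] = (clf.score(X_train, y_train), clf.score(X_test, y_test))

for size, (train_acc, test_acc) in scores.items():
    print(f"{size}: train={train_acc:.3f}, test={test_acc:.3f}")
```

A widening gap between training and held-out accuracy as the layer grows is the usual sign that extra complexity has stopped helping.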

Grid search

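from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Candidate values for each hyperparameter; every combination is tried
param_grid = {'max_iter': [10, 20],
              'hidden_layer_sizes': [(8, ), (16, )]}
clf = GridSearchCV(
  estimator = MLPClassifier(), param_grid = param_grid,
  n_jobs = 4)
# The search must be fitted before best_score_ and best_estimator_ exist;
# X_train and y_train are assumed to be defined already
clf.fit(X_train, y_train)
print(clf.best_score_)
print(clf.best_estimator_)
0.65
MLPClassifier(hidden_layer_sizes = (16,), ...)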

Real life extensions

  • Batch size and number of epochs are also potential hyperparameters
    • Batch size is the number of samples in each mini-batch (training is done in small batches); epochs is the number of passes through the whole training data
  • Weight initialization can vary and affect results
    • Examples of different initializations: uniformly distributed, normally distributed, etc.
  • Keras and TensorFlow are often used rather than sklearn
    • sklearn offers more limited deep learning functionality in comparison
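
The relationship between batch size, epochs and the number of weight updates can be sketched in plain Python. The helper name iterate_minibatches, the sample counts and the initialization ranges below are illustrative assumptions, not part of any library. The sketch makes one shuffled pass over the data per epoch, counts the mini-batches, and then draws weights from a uniform and a normal distribution as two examples of initialization.

```python
import math
import random

def iterate_minibatches(n_samples, batch_size, epochs, seed=0):
    """Yield (epoch, batch_indices); each epoch is one shuffled pass over all samples."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for epoch in range(epochs):
        rng.shuffle(indices)
        for start in range(0, n_samples, batch_size):
            # The last batch of an epoch may be smaller than batch_size
            yield epoch, indices[start:start + batch_size]

n_samples, batch_size, epochs = 1000, 32, 5
n_updates = sum(1 for _ in iterate_minibatches(n_samples, batch_size, epochs))
expected_updates = epochs * math.ceil(n_samples / batch_size)
print(n_updates, expected_updates)

# Two example initializations for a layer with 8 weights (ranges are arbitrary)
rng = random.Random(0)
uniform_init = [rng.uniform(-0.1, 0.1) for _ in range(8)]
normal_init = [rng.gauss(0.0, 0.1) for _ in range(8)]
```

In sklearn's MLPClassifier, batch_size is a constructor parameter, and for the 'sgd' and 'adam' solvers max_iter sets the number of epochs.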

Let's practice!
