Logistic regression and regularization

Linear Classifiers in Python

Michael (Mike) Gelbart

Instructor, The University of British Columbia

Regularized logistic regression


How does regularization affect training accuracy?

lr_weak_reg = LogisticRegression(C=100)
lr_strong_reg = LogisticRegression(C=0.01)

lr_weak_reg.fit(X_train, y_train)
lr_strong_reg.fit(X_train, y_train)

lr_weak_reg.score(X_train, y_train)
1.0
lr_strong_reg.score(X_train, y_train)
0.92

$\text{regularized loss = original loss + large coefficient penalty}$

  • more regularization: lower training accuracy
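The pattern above can be reproduced end to end. This is a sketch on an assumed synthetic dataset (the slides use their own `X_train`/`y_train`, and the accuracy values will differ):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical dataset; any binary classification data behaves the same way.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C is the INVERSE of regularization strength: small C = strong regularization.
for C in [100, 1.0, 0.01]:
    lr = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    print(f"C={C}: train accuracy = {lr.score(X_train, y_train):.2f}")
```

As on the slide, the weakly regularized model (large C) fits the training set better than the strongly regularized one (small C).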

How does regularization affect test accuracy?

lr_weak_reg.score(X_test, y_test)
0.86
lr_strong_reg.score(X_test, y_test)
0.88

$\text{regularized loss = original loss + large coefficient penalty}$

  • more regularization: lower training accuracy
  • more regularization: (almost always) higher test accuracy
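One way to see the "large coefficient penalty" at work is to compare the size of the learned weights directly: stronger regularization shrinks them toward zero. A minimal sketch, again on assumed synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical data; the slides' X_train/y_train would behave the same way.
X_train, y_train = make_classification(n_samples=200, n_features=20,
                                       random_state=0)

lr_weak_reg = LogisticRegression(C=100, max_iter=1000).fit(X_train, y_train)
lr_strong_reg = LogisticRegression(C=0.01, max_iter=1000).fit(X_train, y_train)

# The penalty pushes coefficients toward zero, so the strongly
# regularized model has a much smaller weight norm.
print("weak reg   ||w|| =", np.linalg.norm(lr_weak_reg.coef_))
print("strong reg ||w|| =", np.linalg.norm(lr_strong_reg.coef_))
```

Smaller coefficients mean a less flexible decision boundary, which is exactly the trade-off in the train/test accuracies above.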

L1 vs. L2 regularization

  • Lasso = linear regression with L1 regularization
  • Ridge = linear regression with L2 regularization
  • For other models, like logistic regression, we just say L1, L2, etc.
lr_L1 = LogisticRegression(solver='liblinear', penalty='l1')
lr_L2 = LogisticRegression()  # penalty='l2' by default

lr_L1.fit(X_train, y_train)
lr_L2.fit(X_train, y_train)

import matplotlib.pyplot as plt
plt.plot(lr_L1.coef_.flatten())
plt.plot(lr_L2.coef_.flatten())
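The plot contrasts the two coefficient profiles; the same idea can be checked numerically by counting exact zeros, since L1 tends to produce sparse solutions while L2 only shrinks. The dataset and the C value below are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical dataset where only a few features are informative.
X_train, y_train = make_classification(n_samples=200, n_features=20,
                                       n_informative=3, random_state=0)

lr_L1 = LogisticRegression(solver='liblinear', penalty='l1', C=0.1)
lr_L2 = LogisticRegression(C=0.1, max_iter=1000)
lr_L1.fit(X_train, y_train)
lr_L2.fit(X_train, y_train)

# L1 drives many coefficients exactly to zero; L2 almost never does.
print("zero coefficients with L1:", int(np.sum(lr_L1.coef_ == 0)))
print("zero coefficients with L2:", int(np.sum(lr_L2.coef_ == 0)))
```

This sparsity is why L1 regularization doubles as a feature-selection tool.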

L2 vs. L1 regularization


Let's practice!

