Hyperparameter Tuning in Python
Alex Scriven
Data Scientist
Introducing the GridSearchCV object:
sklearn.model_selection.GridSearchCV(
estimator,
param_grid, scoring=None, fit_params=None,
n_jobs=None, refit=True, cv='warn',
verbose=0, pre_dispatch='2*n_jobs',
error_score='raise-deprecating',
return_train_score='warn')
Steps in a Grid Search:
The important inputs:
estimator
param_grid
cv
scoring
refit
n_jobs
return_train_score
The estimator input:
Remember:
The param_grid input:
Rather than lists:
max_depth_list = [2, 4, 6, 8]
min_samples_leaf_list = [1, 2, 4, 6]
This becomes a dictionary:
param_grid = {'max_depth': [2, 4, 6, 8],
'min_samples_leaf': [1, 2, 4, 6]}
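To make the size of this grid concrete, here is a minimal sketch that expands the dictionary with scikit-learn's ParameterGrid helper. The variable names combinations, n_combinations and first_combination are chosen only for this illustration.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {'max_depth': [2, 4, 6, 8],
              'min_samples_leaf': [1, 2, 4, 6]}

# ParameterGrid expands the dictionary into every combination of values
combinations = list(ParameterGrid(param_grid))

# 4 values for max_depth times 4 values for min_samples_leaf
n_combinations = len(combinations)
first_combination = combinations[0]

print(n_combinations)
print(first_combination)
```

Each combination is one model configuration that a grid search would fit and score, once per cross-validation fold.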
The param_grid input:
Remember: the keys in the param_grid dictionary must be valid hyperparameters of the estimator.
Example for a Logistic Regression estimator:
# Incorrect
param_grid = {'C': [0.1,0.2,0.5],
'best_choice': [10,20,50]}
ValueError: Invalid parameter best_choice for estimator LogisticRegression
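One way to catch this mistake before fitting is to compare the grid keys with the names returned by the estimator's get_params() method. This is a small sketch, and the names valid_names and invalid_keys are made up for the illustration.

```python
from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()

# get_params() returns every hyperparameter the estimator accepts
valid_names = set(log_reg.get_params().keys())

param_grid = {'C': [0.1, 0.2, 0.5],
              'best_choice': [10, 20, 50]}

# Collect any grid keys the estimator does not recognise
invalid_keys = [key for key in param_grid if key not in valid_names]
print(invalid_keys)
```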
The cv input:
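As a rough illustration, cv can be given either as an integer number of folds or as a cross-validation splitter object. The sketch below assumes a small synthetic dataset and a decision tree, neither of which comes from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for this illustration
X, y = make_classification(n_samples=100, random_state=0)
param_grid = {'max_depth': [2, 4]}

# Option 1: an integer; for classifiers this means stratified k-fold
grid_int = GridSearchCV(DecisionTreeClassifier(random_state=0),
                        param_grid, cv=5)

# Option 2: an explicit splitter object, here shuffled 5-fold
splitter = KFold(n_splits=5, shuffle=True, random_state=0)
grid_kfold = GridSearchCV(DecisionTreeClassifier(random_state=0),
                          param_grid, cv=splitter)

grid_int.fit(X, y)
grid_kfold.fit(X, y)
print(grid_int.n_splits_, grid_kfold.n_splits_)
```

A splitter object is useful when you want control over shuffling or the random seed of the folds.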

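The scoring input:
Scoring functions come from Scikit Learn's metrics module.
You can list all of the built-in scoring functions like this:
from sklearn import metrics
# Older scikit-learn releases used: sorted(metrics.SCORERS.keys())
sorted(metrics.get_scorer_names())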
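The scoring input accepts either one of those built-in names or a scorer object. The following sketch shows both, assuming a synthetic binary dataset and an F2 score as the custom metric; those choices are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import GridSearchCV

# Synthetic binary data, used only for this illustration
X, y = make_classification(n_samples=150, random_state=0)
param_grid = {'C': [0.1, 1.0]}

# Option 1: a built-in scorer referenced by name
grid_auc = GridSearchCV(LogisticRegression(max_iter=1000),
                        param_grid, scoring='roc_auc', cv=3)
grid_auc.fit(X, y)

# Option 2: a custom scorer built from a metric function
f2_scorer = make_scorer(fbeta_score, beta=2)
grid_f2 = GridSearchCV(LogisticRegression(max_iter=1000),
                       param_grid, scoring=f2_scorer, cv=3)
grid_f2.fit(X, y)

print(grid_auc.best_score_, grid_f2.best_score_)
```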
The refit input:
With refit enabled, the GridSearchCV object can be used as an estimator (for prediction).
The n_jobs input:
Some handy code:
import os
print(os.cpu_count())
Be careful about using all of your cores for modelling if you want to do other work at the same time!
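One possible way to follow this advice is to leave a core free when choosing n_jobs. This is only a sketch; the names total_cores and n_jobs are picked for the example.

```python
import os

# os.cpu_count() can return None when the count is unknown
total_cores = os.cpu_count() or 1

# Leave one core free for other work, but always use at least one
n_jobs = max(1, total_cores - 1)
print(n_jobs)
```

By contrast, passing n_jobs=-1 asks scikit-learn to use every available core.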
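The return_train_score input:
Building our own GridSearchCV object:
from sklearn.ensemble import RandomForestClassifier
# Create the grid
param_grid = {'max_depth': [2, 4, 6, 8],
              'min_samples_leaf': [1, 2, 4, 6]}
# Get a base classifier with some set parameters
# ('sqrt' is what 'auto' meant for classifiers; 'auto' was removed in recent scikit-learn releases)
rf_class = RandomForestClassifier(criterion='entropy', max_features='sqrt')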
Putting it all together:
from sklearn.model_selection import GridSearchCV
grid_rf_class = GridSearchCV(
    estimator=rf_class,
    param_grid=param_grid,  # the grid dictionary created earlier
    scoring='accuracy',
    n_jobs=4,
    cv=10,
    refit=True,
    return_train_score=True)
Because refit is set to True, we can use the object directly:
# Fit the object to our data
grid_rf_class.fit(X_train, y_train)
# Make predictions
grid_rf_class.predict(X_test)
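After fitting, the search results can be inspected through the cv_results_ and best_params_ attributes. The sketch below is a smaller, self-contained version of the workflow; the synthetic data, the reduced grid, n_estimators=20 and cv=3 are assumptions made to keep it quick to run.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data, used only for this illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

param_grid = {'max_depth': [2, 4], 'min_samples_leaf': [1, 4]}
rf_class = RandomForestClassifier(n_estimators=20, random_state=42)

grid_rf_class = GridSearchCV(
    estimator=rf_class,
    param_grid=param_grid,
    scoring='accuracy',
    cv=3,
    refit=True,
    return_train_score=True)
grid_rf_class.fit(X_train, y_train)

# cv_results_ is a dictionary with one entry per grid combination;
# the train scores are present because return_train_score=True
results = grid_rf_class.cv_results_
for params, train, test in zip(results['params'],
                               results['mean_train_score'],
                               results['mean_test_score']):
    print(params, round(train, 3), round(test, 3))

# The refitted best model is used for prediction
print(grid_rf_class.best_params_)
predictions = grid_rf_class.predict(X_test)
```

A large gap between the mean train score and the mean test score for a combination is a hint that it may be overfitting.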