Selecting your final model

Model Validation in Python

Kasey Jones

Data Scientist

# Best score
rs.best_score_
5.45
# Best parameters
rs.best_params_
{'max_depth': 4, 'max_features': 8, 'min_samples_split': 4}
# Best estimator
rs.best_estimator_
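
The slides do not show how `rs` was created. A minimal sketch follows, assuming `rs` is a fitted `RandomizedSearchCV` over a random forest regressor; the synthetic data, the search space, and the scorer are all illustrative assumptions rather than the course's actual setup.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data stands in for the course dataset (assumption).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=1111)

# Hypothetical search space; the one used in the slides is not shown.
param_dist = {
    "max_depth": [2, 4, 6, 8],
    "max_features": [2, 4, 6, 8],
    "min_samples_split": [2, 4, 8],
}

rs = RandomizedSearchCV(
    estimator=RandomForestRegressor(n_estimators=20, random_state=1111),
    param_distributions=param_dist,
    n_iter=5,                            # number of sampled combinations
    cv=3,                                # 3-fold cross-validation
    scoring="neg_mean_absolute_error",   # higher (closer to 0) is better
    random_state=1111,
)
rs.fit(X, y)

print(rs.best_score_)      # mean cross-validated score of the winner
print(rs.best_params_)     # dict of the winning parameter values
print(rs.best_estimator_)  # the winning model, refit on all of X, y
```

Because scikit-learn treats larger scores as better, the built-in error scorers are negated, so `best_score_` is negative here. A positive value such as the 5.45 above suggests a different scorer was used.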

Other attributes

rs.cv_results_

rs.cv_results_['mean_test_score']
array([5.45, 6.23, 5.87, 5.91, 5.67])
# Selected parameters:
rs.cv_results_['params']
[{'max_depth': 10, 'min_samples_split': 8, 'n_estimators': 25},
 {'max_depth': 4, 'min_samples_split': 8, 'n_estimators': 50},
 ...]
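
Since `cv_results_` is a dictionary of equal-length arrays, it can be loaded into a pandas DataFrame for easier inspection. The sketch below assumes a small synthetic dataset and a hypothetical search space; only the `cv_results_` keys themselves come from scikit-learn.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data and search space are illustrative assumptions.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=1111)
param_dist = {
    "max_depth": [4, 10],
    "min_samples_split": [2, 8],
    "n_estimators": [25, 50],
}

rs = RandomizedSearchCV(
    RandomForestRegressor(random_state=1111),
    param_distributions=param_dist,
    n_iter=5, cv=3, random_state=1111,
)
rs.fit(X, y)

# One row per sampled parameter combination.
results = pd.DataFrame(rs.cv_results_)
cols = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
print(results[cols].sort_values("rank_test_score"))
```

The row with `rank_test_score` equal to 1 corresponds to `rs.best_params_`.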

Using .cv_results_

Group by max_depth:

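import pandas as pd

# Pair each sampled max_depth with its mean cross-validated score
max_depth = [item['max_depth'] for item in rs.cv_results_['params']]
scores = list(rs.cv_results_['mean_test_score'])
d = pd.DataFrame([max_depth, scores]).T
d.columns = ['Max Depth', 'Score']
# Average the scores for each value of max_depth
d.groupby(['Max Depth']).mean()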
Max Depth  Score        
2.0        0.677928
4.0        0.753021
6.0        0.817219
8.0        0.879136
10.0       0.896821

Other attributes (continued)

Uses of the output:

  • Visualize the effect of each parameter
  • Draw conclusions about which parameters have the most impact
Max Depth  Score        
2.0        0.677928
4.0        0.753021
6.0        0.817219
8.0        0.879136
10.0       0.896821
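
One rough way to compare the impact of the parameters is to repeat the grouping for each of them and look at how far the group means spread apart. This is only a sketch on synthetic data with a hypothetical search space; with few sampled combinations the averages are noisy, so the ranking should be read as a hint rather than a conclusion.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data and search space are illustrative assumptions.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0,
                       random_state=1111)
param_dist = {
    "max_depth": [2, 4, 6, 8, 10],
    "max_features": [2, 4, 6, 8],
    "min_samples_split": [2, 4, 8, 12],
}

rs = RandomizedSearchCV(
    RandomForestRegressor(n_estimators=20, random_state=1111),
    param_distributions=param_dist,
    n_iter=10, cv=3, random_state=1111,
)
rs.fit(X, y)

# One column per parameter plus the mean cross-validated score.
params = pd.DataFrame(rs.cv_results_["params"])
params["score"] = rs.cv_results_["mean_test_score"]

# Spread of the per-value mean scores, one entry per parameter.
spread = {}
for name in param_dist:
    means = params.groupby(name)["score"].mean()
    spread[name] = means.max() - means.min()

print(pd.Series(spread).sort_values(ascending=False))
```

A larger spread suggests that the score changes more as that parameter varies, at least within the sampled combinations.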

Selecting the best model

rs.best_estimator_ contains the information of the best model

rs.best_estimator_
RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=8,
           max_features=8, max_leaf_nodes=None, min_impurity_decrease=0.0,
           min_impurity_split=None, min_samples_leaf=1,
           min_samples_split=12, min_weight_fraction_leaf=0.0,
           n_estimators=20, n_jobs=1, oob_score=False, random_state=1111,
           verbose=0, warm_start=False)

Comparing types of models

Random forest:

rfr.score(X_test, y_test)
6.39

Gradient Boosting:

gb.score(X_test, y_test)
6.23
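
For scikit-learn regressors, `.score()` returns R² by default, which cannot exceed 1, so values such as 6.39 and 6.23 read more like an error metric. The sketch below assumes mean absolute error as the comparison metric and uses synthetic data; the model settings are illustrative, not the course's.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the course dataset (assumption).
X, y = make_regression(n_samples=300, n_features=10, noise=10.0,
                       random_state=1111)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1111)

# Hypothetical model settings for illustration.
models = {
    "random forest": RandomForestRegressor(n_estimators=50,
                                           random_state=1111),
    "gradient boosting": GradientBoostingRegressor(random_state=1111),
}

# Fit each model on the training data and score it on the held-out test set.
errors = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    errors[name] = mean_absolute_error(y_test, model.predict(X_test))

# Lower error is better.
best = min(errors, key=errors.get)
print(errors, best)
```

Both models are evaluated on the same held-out test set so that the comparison is fair.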

Using .best_estimator_

Predict new data:

rs.best_estimator_.predict(<new_data>)

Check the parameters:

rs.best_estimator_.get_params()

Save the model for later:

# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly
import joblib

joblib.dump(rfr, 'rfr_best_<date>.pkl')
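
A saved model is only useful if it can be loaded again. The following sketch shows a save and reload round trip with `joblib`; the synthetic data, the temporary directory, and the file name `rfr_best.pkl` are assumptions made for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data stands in for the course dataset (assumption).
X, y = make_regression(n_samples=100, n_features=8, noise=5.0,
                       random_state=1111)
rfr = RandomForestRegressor(n_estimators=20, max_depth=8,
                            random_state=1111).fit(X, y)

# Hypothetical file location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "rfr_best.pkl")
joblib.dump(rfr, path)

# Later, possibly in another session: load and predict.
loaded = joblib.load(path)
same = np.allclose(rfr.predict(X), loaded.predict(X))
print(same)
```

Pickled models are generally meant to be loaded with the same scikit-learn version that saved them, so it is worth recording that version alongside the file.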

Let's practice!
