Model Validation in Python
Kasey Jones
Data Scientist
# Best score
rs.best_score_
5.45
# Best parameters
rs.best_params_
{'max_depth': 4, 'max_features': 8, 'min_samples_split': 4}
# Best estimator
rs.best_estimator_
rs.cv_results_
rs.cv_results_['mean_test_score']
array([5.45, 6.23, 5.87, 5.91, 5.67])
# Selected parameters:
rs.cv_results_['params']
[{'max_depth': 10, 'min_samples_split': 8, 'n_estimators': 25},
{'max_depth': 4, 'min_samples_split': 8, 'n_estimators': 50},
...]
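For context, the attributes above belong to a fitted search object. The following is a minimal sketch of how an object like `rs` might be produced, assuming scikit-learn's `RandomizedSearchCV` wrapped around a `RandomForestRegressor`. The synthetic data, the search space, and the `neg_mean_absolute_error` scorer are all assumptions made for illustration; the slides do not show the original setup.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data (assumption: the original dataset is not shown)
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=1111)

# Hypothetical search space, chosen only for illustration
param_dist = {
    "max_depth": [2, 4, 6, 8, 10],
    "min_samples_split": [2, 4, 8, 12],
    "n_estimators": [10, 25, 50],
}

# Sample 5 parameter combinations and evaluate each with 3-fold CV
rs = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=1111),
    param_distributions=param_dist,
    n_iter=5,
    cv=3,
    scoring="neg_mean_absolute_error",
    random_state=1111,
)
rs.fit(X, y)

print(rs.best_score_)                     # mean CV score of the best combination
print(rs.best_params_)                    # the combination that achieved it
print(rs.cv_results_["mean_test_score"])  # one mean score per sampled combination
```

With this scorer the reported values are negated errors, so values closer to zero are better. The positive scores shown in the slides presumably come from a different scorer.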
Group the maximum depths:
max_depth = [item['max_depth'] for item in rs.cv_results_['params']]
scores = list(rs.cv_results_['mean_test_score'])
d = pd.DataFrame([max_depth, scores]).T
d.columns = ['Max Depth', 'Score']
d.groupby(['Max Depth']).mean()
Max Depth Score
2.0 0.677928
4.0 0.753021
6.0 0.817219
8.0 0.879136
10.0 0.896821
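The grouping step above can be tried without running a search. This sketch uses a small hand-written dictionary as a stand-in for `rs.cv_results_`; the parameter values and scores in it are invented for illustration. It builds the frame from a dict of columns rather than transposing a list of lists, which keeps `Max Depth` as an integer column.

```python
import pandas as pd

# Hypothetical stand-in for rs.cv_results_ (values invented for illustration)
cv_results = {
    "params": [
        {"max_depth": 2, "n_estimators": 25},
        {"max_depth": 2, "n_estimators": 50},
        {"max_depth": 4, "n_estimators": 25},
        {"max_depth": 4, "n_estimators": 50},
    ],
    "mean_test_score": [0.60, 0.70, 0.75, 0.85],
}

# One row per sampled parameter combination
d = pd.DataFrame({
    "Max Depth": [p["max_depth"] for p in cv_results["params"]],
    "Score": cv_results["mean_test_score"],
})

# Average the scores of all combinations that share the same depth
mean_by_depth = d.groupby("Max Depth")["Score"].mean()
print(mean_by_depth)
```

The same pattern should work for any other searched parameter by swapping the key pulled out of each `params` entry.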
Uses of the output: the grouped means show how the score changes with maximum depth, here rising from 0.677928 at a depth of 2 to 0.896821 at a depth of 10.
rs.best_estimator_ holds the best model found by the search:
rs.best_estimator_
RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=8,
max_features=8, max_leaf_nodes=None, min_impurity_decrease=0.0,
min_impurity_split=None, min_samples_leaf=1,
min_samples_split=12, min_weight_fraction_leaf=0.0,
n_estimators=20, n_jobs=1, oob_score=False, random_state=1111,
verbose=0, warm_start=False)
Random forest:
rfr.score(X_test, y_test)
6.39
Gradient boosting:
gb.score(X_test, y_test)
6.23
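A sketch of comparing the two model types on the same held-out data follows. For scikit-learn regressors, `.score` returns R² by default, so this example computes an explicit error metric instead. The synthetic data, the hyperparameters, and the choice of mean absolute error are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data and a single train/test split (assumptions for illustration)
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=1111)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1111
)

# Fit both candidates on the same training data
rfr = RandomForestRegressor(n_estimators=50, random_state=1111).fit(X_train, y_train)
gb = GradientBoostingRegressor(random_state=1111).fit(X_train, y_train)

# Evaluate both on the same held-out data; lower error is better
errors = {
    "random forest": mean_absolute_error(y_test, rfr.predict(X_test)),
    "gradient boosting": mean_absolute_error(y_test, gb.predict(X_test)),
}
for name, err in errors.items():
    print(f"{name}: MAE = {err:.2f}")
```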
Predict on new data:
rs.best_estimator_.predict(<new_data>)
Check the parameters:
rs.best_estimator_.get_params()
Save the model for later use:
import joblib
joblib.dump(rfr, 'rfr_best_<date>.pkl')
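To check that a saved model can be used later, it can be loaded back and compared with the original. This is a minimal round-trip sketch, assuming the standalone `joblib` package; the synthetic data, the small forest, and the file name `rfr_best_example.pkl` are hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Small synthetic problem and model (assumptions for illustration)
X, y = make_regression(n_samples=100, n_features=4, random_state=1111)
rfr = RandomForestRegressor(n_estimators=10, random_state=1111).fit(X, y)

# Hypothetical file name; a temporary directory keeps the example self-cleaning
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rfr_best_example.pkl")
    joblib.dump(rfr, path)      # write the fitted model to disk
    loaded = joblib.load(path)  # read it back into memory

# The reloaded model should reproduce the original predictions
same_predictions = np.allclose(rfr.predict(X), loaded.predict(X))
print(same_predictions)
```

A file saved this way is generally only safe to load with the same scikit-learn version that produced it, and only from a trusted source.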