Model Validation in Python
Kasey Jones
Data Scientist

| Dataset | Definition |
|---|---|
| Train | The data used to fit models |
| Test (holdout sample) | The data used to assess model performance |

Example ratios
import pandas as pd
tic_tac_toe = pd.read_csv("tic-tac-toe.csv")
X = pd.get_dummies(tic_tac_toe.iloc[:,0:9])
y = tic_tac_toe.iloc[:, 9]
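As a minimal sketch of what `pd.get_dummies` does to categorical board columns, the example below uses a tiny made-up frame. The column names `top_left` and `top_middle` and the three rows are assumptions for illustration only, not the contents of `tic-tac-toe.csv`.

```python
import pandas as pd

# Hypothetical stand-in for two of the nine board columns.
board = pd.DataFrame({
    "top_left": ["x", "o", "b"],
    "top_middle": ["o", "x", "x"],
})

# Each categorical column becomes one indicator column per observed value.
encoded = pd.get_dummies(board)
print(encoded.columns.tolist())
print(encoded.shape)
```

Here `top_left` has three distinct values and `top_middle` has two, so the encoded frame has five indicator columns.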
Python courses on dummy variables:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test =\
    train_test_split(X, y, test_size=0.2, random_state=1111)
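To see what a holdout split like this produces, here is a small sketch on synthetic data. The arrays `X_demo` and `y_demo` are assumptions standing in for the real features and labels. It checks the size of each part and that a fixed `random_state` reproduces the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 3 features (not the tic-tac-toe set).
X_demo = np.arange(300).reshape(100, 3)
y_demo = np.arange(100) % 2

# test_size=0.2 holds out 20% of the rows; random_state fixes the shuffle.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=1111)

# Repeating the call with the same seed yields the same split.
X_tr2, X_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=1111)

print(X_tr.shape, X_te.shape)
print(np.array_equal(X_te, X_te2))
```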
Parameters:
- test_size
- train_size
- random_state

What do we do when testing different model parameters?

X_temp, X_test, y_temp, y_test =\
train_test_split(X, y, test_size=0.2, random_state=1111)
X_train, X_val, y_train, y_val =\
train_test_split(X_temp, y_temp, test_size=0.25, random_state=11111)
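The two-step split above should leave roughly 60% of the rows for training, 20% for validation, and 20% for testing, because 0.25 of the remaining 80% is 20% of the total. A quick sketch on synthetic data (the arrays `X_demo` and `y_demo` are assumptions, not the course dataset) confirms the arithmetic:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 500 rows, 2 features.
X_demo = np.arange(1000).reshape(500, 2)
y_demo = np.arange(500) % 2

# Step 1: hold out 20% of all rows as the test set.
X_temp, X_test, y_temp, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=1111)

# Step 2: take 25% of the remaining 80% as the validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=11111)

total = len(X_demo)
fractions = {
    "train": len(X_train) / total,
    "validation": len(X_val) / total,
    "test": len(X_test) / total,
}
print(fractions)
```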