Model training

End-to-End Machine Learning

Joshua Stapleton

Machine Learning Engineer

Occam's razor

  • The simplest sufficient explanation is the best
  • Prefer simple models where possible

Example image illustrating the principle of Occam's razor

Model choices

Logistic regression

  • Finds the decision boundary between classes
  • sklearn.linear_model.LogisticRegression

Support Vector Classifier

  • Finds a hyperplane that separates the classes
  • sklearn.svm.SVC

Decision tree

  • Builds simple "rules" to classify data
  • sklearn.tree.DecisionTreeClassifier

Random Forest

  • Combines multiple decision trees
  • sklearn.ensemble.RandomForestClassifier
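
The four classifiers listed above share the same scikit-learn interface, so they can be swapped with very little code. The following is a minimal sketch, assuming a synthetic dataset from `make_classification` as a stand-in for real data; the dictionary keys and hyperparameter values are illustrative choices, not recommendations from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 200 samples, 5 numeric features, 2 classes
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# One instance of each model family named on the slide
models = {
    "logistic_regression": LogisticRegression(max_iter=200),
    "support_vector_classifier": SVC(),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Every estimator exposes the same fit/predict methods
fitted = {}
for name, model in models.items():
    fitted[name] = model.fit(X, y)
    print(name, "-> classes:", fitted[name].classes_)
```

Because the interface is uniform, starting with the simplest model and moving to a more complex one only when needed costs a one-line change.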

Other models

Deep learning models

  • Neural networks
  • Convolutional neural networks
  • Generative Pretrained Transformer (GPT)

K-Nearest Neighbors (KNN)

  • Supervised learning algorithm
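
As a supervised algorithm, KNN is trained on labeled examples like the other classifiers. A minimal sketch using scikit-learn's `KNeighborsClassifier` follows; the synthetic data and the choice of `n_neighbors=5` are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): labeled samples with 5 numeric features
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Each prediction is a vote among the 5 closest training samples
knn_model = KNeighborsClassifier(n_neighbors=5)
knn_model.fit(X, y)

knn_predictions = knn_model.predict(X[:3])
print(knn_predictions)
```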

XGBoost

Training principles

Model:

  • Uses cleaned data with processed features
  • Learns patterns in the training data
  • Goal: predict whether a person has heart disease

Principles:

  • The model must generalize to unseen data (outside the training set)
  • Hold out part of the data to test on after training
  • Train/test splits are often 70/30 or 80/20
  • Use sklearn.model_selection.train_test_split
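
To make the hold-out principle concrete, here is a small sketch of a 70/30 split. It assumes a synthetic dataset of 100 samples in place of the heart disease data; only `test_size` needs to change to get an 80/20 split.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 100 samples, 5 numeric features
X, y = make_classification(n_samples=100, n_features=5, random_state=42)

# test_size=0.3 holds out roughly 30% of the rows; use 0.2 for an 80/20 split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# The two parts together cover every row exactly once
print("training rows:", len(X_train))
print("test rows:", len(X_test))
```

Fixing `random_state` makes the split reproducible, which helps when comparing models on the same held-out rows.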

Training a model

# Import the necessary libraries
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Split the data into training and testing sets (80:20)
X_train, X_test, y_train, y_test = train_test_split(features, heart_disease_y, test_size=0.2, random_state=42)

# Define the model
logistic_model = LogisticRegression(max_iter=200)

# Train the model
logistic_model.fit(X_train, y_train)
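
The held-out test set exists so the model can be checked on data it did not see during training. The sketch below shows one way to do that with the estimator's `score` method, which returns mean accuracy for classifiers. The `features` and `heart_disease_y` arrays here are synthetic stand-ins (an assumption), not the course dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins (assumption) for the features and heart disease labels
features, heart_disease_y = make_classification(
    n_samples=300, n_features=5, random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    features, heart_disease_y, test_size=0.2, random_state=42
)

logistic_model = LogisticRegression(max_iter=200)
logistic_model.fit(X_train, y_train)

# Accuracy on data seen during training vs. data held out from training
train_accuracy = logistic_model.score(X_train, y_train)
test_accuracy = logistic_model.score(X_test, y_test)
print(f"Train accuracy: {train_accuracy:.2f}")
print(f"Test accuracy: {test_accuracy:.2f}")
```

A test accuracy far below the training accuracy would suggest the model is not generalizing well.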

Getting model predictions

import numpy as np

# Jane Doe's health data, for example: [age, cholesterol level, blood pressure, etc.]
# The trailing ... stands for the remaining feature values, which are elided here
jane_doe_data = np.array([45, 230, 120, ...])

# Reshape the data to 2D, because scikit-learn expects a 2D array-like input
jane_doe_data = jane_doe_data.reshape(1, -1)

# Use the model to predict Jane's class probabilities and her diagnosis
jane_doe_probabilities = logistic_model.predict_proba(jane_doe_data)
jane_doe_prediction = logistic_model.predict(jane_doe_data)

Model predictions (continued)

# Print the predicted class probabilities and the predicted class
print(f"Jane Doe's predicted probabilities: {jane_doe_probabilities[0]}")
print(f"Jane Doe's predicted health condition: {jane_doe_prediction[0]}")

Jane Doe's predicted probabilities: [0.2 0.8]
Jane Doe's predicted health condition: 1
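
The two outputs are related: `predict_proba` returns one probability per class, in the order given by the model's `classes_` attribute, and `predict` reports the class with the highest probability. The sketch below illustrates this on synthetic data (an assumption); the `patient` row is a hypothetical sample, not Jane Doe's record.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption): 5 numeric features, binary label
X, y = make_classification(n_samples=300, n_features=5, random_state=42)

logistic_model = LogisticRegression(max_iter=200)
logistic_model.fit(X, y)

# A single hypothetical patient, reshaped to the 2D input scikit-learn expects
patient = X[0].reshape(1, -1)

probabilities = logistic_model.predict_proba(patient)  # shape (1, 2)
prediction = logistic_model.predict(patient)           # shape (1,)

# Column i of the probabilities corresponds to logistic_model.classes_[i]
most_likely = logistic_model.classes_[np.argmax(probabilities, axis=1)]

print("classes:", logistic_model.classes_)
print("probabilities:", probabilities[0])
print("prediction:", prediction[0], "| argmax class:", most_likely[0])
```

In the slide's output, the probabilities [0.2 0.8] put the higher value on the second class, which matches the predicted label 1.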

Let's practice!
