Model training

End-to-End Machine Learning

Joshua Stapleton

Machine Learning Engineer

Occam's Razor

  • The simplest adequate explanation is the best one
  • When choosing between models, prefer the simpler one

[Figure: example image illustrating the principle of Occam's razor]
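
One way to apply this principle in code is sketched below. It is only an illustration under stated assumptions: the data is synthetic (not the course's heart disease dataset), the simple-to-complex ordering of the candidates is a judgment call, and the `tolerance` rule is hypothetical rather than a standard scikit-learn feature.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: not the course's dataset)
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Candidates listed from simplest to most complex (an illustrative ordering)
candidates = [
    ("logistic_regression", LogisticRegression(max_iter=1000)),
    ("decision_tree", DecisionTreeClassifier(random_state=0)),
    ("random_forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Mean 5-fold cross-validated accuracy for each candidate
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates
}

# Hypothetical rule: take the simplest model whose score is within
# `tolerance` of the best score
tolerance = 0.02
best_score = max(scores.values())
chosen = next(
    name for name, _ in candidates if scores[name] >= best_score - tolerance
)

print(scores)
print("chosen:", chosen)
```

The rule only moves to a more complex model when the simpler ones fall clearly behind, which is the preference the slide describes.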

Modeling options

Logistic Regression

  • Finds a decision boundary between classes
  • sklearn.linear_model.LogisticRegression

Support Vector Classifier

  • Finds a hyperplane that separates the classes
  • sklearn.svm.SVC

Decision Tree

  • Finds simple rules for classifying the data
  • sklearn.tree.DecisionTreeClassifier

Random Forest

  • Combines many decision trees
  • sklearn.ensemble.RandomForestClassifier
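
The four classes listed above share the same `fit`/`predict` interface, so they can be swapped in and out of the same code. The sketch below shows this on synthetic data; the dataset and the hyperparameters are assumptions chosen only to make the example run.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: not the course's dataset)
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
}

# Every estimator is trained and queried through the same two calls
predictions = {}
for name, model in models.items():
    model.fit(X, y)
    predictions[name] = model.predict(X[:5])
    print(name, predictions[name])
```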

Other models

Deep learning models

  • Neural Network
  • Convolutional Neural Network
  • Generative Pretrained Transformer (GPT)

K-Nearest Neighbors (KNN)

  • Supervised learning algorithm that predicts from the k closest training examples
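
As a rough illustration of how KNN arrives at a prediction, the sketch below uses `sklearn.neighbors.KNeighborsClassifier` on synthetic data. The dataset and the choice of `n_neighbors=5` are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption: not the course's dataset)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# k = 5 is an arbitrary illustrative choice
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)

# Look up the 5 nearest training points for one sample
sample = X[:1]
distances, indices = knn.kneighbors(sample)
neighbor_labels = y[indices[0]]

# With the default uniform weights, the prediction is the majority label
majority = np.bincount(neighbor_labels).argmax()
prediction = knn.predict(sample)[0]

print("neighbor labels:", neighbor_labels)
print("majority:", majority, "prediction:", prediction)
```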

XGBoost

Training principles

The model:

  • Uses data that has already been cleaned and feature-engineered
  • Learns patterns in the training data
  • Aims to predict the target: the heart disease diagnosis

Principles:

  • The model must generalize to new data (outside the training data)
  • Hold out part of the data to test the model after training is finished
  • Train/test splits of 70/30 or 80/20 are common
  • sklearn.model_selection.train_test_split can be used for this
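
The two common split ratios can be sketched as follows. The toy arrays are assumptions standing in for real features and labels; only `train_test_split` and its `test_size` and `random_state` parameters come from scikit-learn.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 100 samples with 2 features each (assumption for illustration)
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

split_sizes = {}
for test_size in (0.3, 0.2):  # 70/30 and 80/20 splits
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=42
    )
    split_sizes[test_size] = (len(X_train), len(X_test))
    print(f"test_size={test_size}: train={len(X_train)}, test={len(X_test)}")
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in the test set on every run.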

Training a model

# Import the necessary libraries
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Split the data into training and testing sets (80:20)
X_train, X_test, y_train, y_test = train_test_split(features, heart_disease_y, test_size=0.2, random_state=42)

# Define the model
logistic_model = LogisticRegression(max_iter=200)

# Train the model
logistic_model.fit(X_train, y_train)
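
The slide's snippet relies on `features` and `heart_disease_y`, which are defined elsewhere in the course. A self-contained sketch is given below, with synthetic arrays standing in for both (an assumption), plus a check of accuracy on the held-out test set using the estimator's `score` method.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for `features` and `heart_disease_y` (assumption)
features, heart_disease_y = make_classification(
    n_samples=500, n_features=5, random_state=42
)

# Split the data into training and testing sets (80:20)
X_train, X_test, y_train, y_test = train_test_split(
    features, heart_disease_y, test_size=0.2, random_state=42
)

# Define and train the model
logistic_model = LogisticRegression(max_iter=200)
logistic_model.fit(X_train, y_train)

# Accuracy on seen data versus held-out data
train_accuracy = logistic_model.score(X_train, y_train)
test_accuracy = logistic_model.score(X_test, y_test)
print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy: {test_accuracy:.3f}")
```

The test accuracy is the number that speaks to generalization, since the model never saw those rows during training.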

Getting model predictions

import numpy as np

# Jane Doe's health data, for example: [age, cholesterol level, blood pressure, etc.]
# The trailing ... stands for the remaining feature values, which the slide elides
jane_doe_data = np.array([45, 230, 120, ...])

# Reshape the data to 2D, because scikit-learn expects a 2D array-like input
jane_doe_data = jane_doe_data.reshape(1, -1)

# Use the model to predict Jane's heart disease diagnosis probabilities and class
jane_doe_probabilities = logistic_model.predict_proba(jane_doe_data)
jane_doe_prediction = logistic_model.predict(jane_doe_data)

Getting model predictions (cont.)

# Print the predicted probabilities and the predicted class
print(f"Jane Doe's predicted probabilities: {jane_doe_probabilities[0]}")
print(f"Jane Doe's predicted health condition: {jane_doe_prediction[0]}")

Jane Doe's predicted probabilities: [0.2 0.8]
Jane Doe's predicted health condition: 1
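
The relationship between the two outputs can be sketched as follows: `predict_proba` returns one probability per class, in the order given by `classes_`, and `predict` returns the most probable class. The data and the `patient` row here are synthetic assumptions, not Jane Doe's actual record.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption: not the course's dataset)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
model = LogisticRegression(max_iter=200).fit(X, y)

# A single hypothetical patient, reshaped to 2D: (1 sample, n features)
patient = np.array(X[0]).reshape(1, -1)

probabilities = model.predict_proba(patient)  # shape (1, n_classes)
prediction = model.predict(patient)           # shape (1,)

# The columns of `probabilities` follow the order of model.classes_
most_likely = model.classes_[np.argmax(probabilities, axis=1)]

print("classes:", model.classes_)
print("probabilities:", probabilities[0])
print("prediction:", prediction[0])
```

In the slide's output, `[0.2 0.8]` reads the same way: probability 0.2 for class 0 and 0.8 for class 1, so the predicted class is 1.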

Let's practice!
