Artificial Intelligence (AI) Concepts in Python
Nemanja Radojkovic
Senior Data Scientist
Test data ≠ training data
Simplest approach (Hold-out method)
Code example:
from sklearn.model_selection import train_test_split
# Hold out 40% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
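A minimal, runnable sketch of the hold-out split. The slides do not show where X and y come from, so the synthetic dataset here is an assumption, as are the `stratify` and `random_state` arguments (added only to keep class proportions similar and the split reproducible).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real features X and labels y (assumption).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Hold out 40% of the rows for testing.
# stratify=y keeps the class balance similar in both parts (not in the slides).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)

print(X_train.shape, X_test.shape)  # (600, 10) (400, 10)
```

The model should only ever see X_train and y_train during training; X_test and y_test are kept aside for evaluation.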
Use the default model configuration/hyper-parameters:
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
Use a custom model configuration/hyper-parameters:
model = RandomForestClassifier(
    n_estimators=500,  # Number of trees
    max_depth=20,      # Maximum tree depth
)
Start the training procedure:
model.fit(X_train, y_train)
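To see the two configurations side by side, here is a small sketch that trains both on the same split and compares test accuracy. The synthetic data, the `random_state` values, and the `scores` dictionary are assumptions made for illustration; results on real data will differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# Default hyper-parameters versus the custom configuration from the slide.
default_model = RandomForestClassifier(random_state=0)
custom_model = RandomForestClassifier(
    n_estimators=500, max_depth=20, random_state=0
)

scores = {}
for name, model in [("default", default_model), ("custom", custom_model)]:
    model.fit(X_train, y_train)                    # Train on the training part only
    scores[name] = model.score(X_test, y_test)     # Mean accuracy on held-out data
    print(name, scores[name])
```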
Generic syntax
model.predict(X=X_test)
Example: News title classifier
model.predict(X=['Denver Nuggets win against GSW and clinch playoff spot!'])
Out: ['Sport']
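For predict to accept raw strings like the title above, the text has to be turned into numbers first. The slides do not show how the news classifier was built, so the following is only one possible sketch: a pipeline with a TF-IDF vectorizer in front of the random forest, trained on a tiny made-up dataset (the titles and labels below are hypothetical).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set, for illustration only.
titles = [
    "Lakers win the championship after dramatic final",
    "Team clinch playoff spot with late goal",
    "Central bank raises interest rates again",
    "Stock markets fall as inflation climbs",
    "New smartphone released with faster chip",
    "Software update fixes major security bug",
]
labels = ["Sport", "Sport", "Business", "Business", "Tech", "Tech"]

# The vectorizer converts text to numeric features; the forest classifies them.
model = make_pipeline(TfidfVectorizer(), RandomForestClassifier(random_state=0))
model.fit(titles, labels)

prediction = model.predict(
    ["Denver Nuggets win against GSW and clinch playoff spot!"]
)
print(prediction)
```

With so few training examples the predicted label is not reliable; the point is only that the pipeline accepts a list of strings and returns one label per string.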
y_predicted = model.predict(X_test_all)
Is y_predicted == y_true ?
from sklearn.metrics import confusion_matrix
confusion_matrix(y_true, y_predicted)
The confusion matrix:
| | REALITY: YES | REALITY: NO |
|---|---|---|
| PREDICTION: YES | 560 | 80 |
| PREDICTION: NO | 50 | 210 |
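One detail worth checking when reading scikit-learn output: confusion_matrix places the actual classes in rows and the predicted classes in columns, so it is laid out differently from the table above, which has predictions in rows. A small sketch with made-up labels (the toy lists and the `slide_layout` name are assumptions):

```python
from sklearn.metrics import confusion_matrix

# Toy labels, for illustration only.
y_true      = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
y_predicted = ["yes", "yes", "yes", "no", "yes", "yes", "no", "no"]

# labels= fixes the class order; rows are actual classes, columns are predictions.
cm = confusion_matrix(y_true, y_predicted, labels=["yes", "no"])
print(cm)
# [[3 1]
#  [2 2]]

# Transposing gives the layout of the table above: rows = prediction, columns = reality.
slide_layout = cm.T
print(slide_layout)
# [[3 2]
#  [1 2]]
```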
| | Diabetes present | No diabetes |
|---|---|---|
| Diabetes predicted | True positives | False positives |
| No diabetes predicted | False negatives | True negatives |
TRUE POSITIVE = the model predicts diabetes and the patient actually has it.
TRUE NEGATIVE = the model predicts no diabetes and the patient is actually healthy.
FALSE POSITIVE = the model predicts diabetes but the patient is actually healthy (Type I error).
FALSE NEGATIVE = the patient has diabetes but the model does not detect it (Type II error).
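The four counts can be read straight out of scikit-learn's confusion matrix. A minimal sketch, assuming labels are encoded as 1 = diabetes and 0 = no diabetes; the toy arrays are made up for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# 1 = diabetes, 0 = no diabetes (toy data, assumption).
y_true      = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_predicted = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# With labels=[0, 1], the flattened matrix is ordered tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_predicted, labels=[0, 1]).ravel()
print(tp, tn, fp, fn)  # 3 3 1 1

# Cross-check one count by hand: predicted diabetes and actually diabetic.
tp_manual = int(np.sum((y_true == 1) & (y_predicted == 1)))
```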
Metrics:
Using Python and scikit-learn:
from sklearn.metrics import accuracy_score, precision_score, recall_score
accuracy_score(y_true, y_predicted) # Same arguments for precision and recall
Result: 0.88
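To show how the three metrics relate to the confusion matrix, here is a sketch that computes them by hand from the counts in the YES/NO table earlier (560, 80, 50, 210) and cross-checks them against scikit-learn. The label arrays are reconstructed from those counts purely for illustration; the data behind the 0.88 result above is not given, so the numbers here need not match it.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Counts taken from the YES/NO confusion matrix earlier in the slides.
tp, fp, fn, tn = 560, 80, 50, 210

accuracy = (tp + tn) / (tp + fp + fn + tn)  # Share of all predictions that are correct
precision = tp / (tp + fp)                  # Share of predicted positives that are real
recall = tp / (tp + fn)                     # Share of real positives that are found

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# accuracy=0.856 precision=0.875 recall=0.918

# Rebuild label lists with the same counts (1 = YES, 0 = NO) to check with scikit-learn.
y_true      = [1] * tp + [0] * fp + [1] * fn + [0] * tn
y_predicted = [1] * tp + [1] * fp + [0] * fn + [0] * tn

sk_accuracy = accuracy_score(y_true, y_predicted)
sk_precision = precision_score(y_true, y_predicted)
sk_recall = recall_score(y_true, y_predicted)
```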