Recurrent Neural Networks (RNNs) for Language Modeling with Keras
David Cecchini
Data Scientist
20 classes, 80% accuracy. Is the model good?
No idea!
Check true versus predicted labels for each class

$$\text{Precision}_{\text{class}} = \frac{\text{Correct}_{\text{class}}}{\text{Predicted}_{\text{class}}}$$
In this example:
$$ \text{Precision}_{\text{sci.space}} = \frac{76}{76+7+9} = 0.83 $$
$$ \text{Precision}_{\text{alt.atheism}} = \frac{1}{2+1+0} = 0.33 $$
$$ \text{Precision}_{\text{soc.religion.christian}} = \frac{3}{0+2+3} = 0.60 $$
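The precision values can be reproduced by hand from the confusion matrix counts of this example. The following is a minimal sketch in plain Python, assuming the sklearn layout (rows are true classes, columns are predicted classes); the variable names are illustrative only.

```python
# Confusion matrix counts from this example.
# Rows are true classes, columns are predicted classes (sklearn layout).
matrix = [[76, 2, 0],
          [7, 1, 2],
          [9, 0, 3]]
names = ["sci.space", "alt.atheism", "soc.religion.christian"]

precision = {}
for j, name in enumerate(names):
    # Everything predicted as class j is the sum of column j.
    predicted = sum(row[j] for row in matrix)
    # Correct predictions for class j sit on the diagonal.
    precision[name] = matrix[j][j] / predicted

for name, value in precision.items():
    print(f"{name}: {value:.2f}")
```

Each denominator is a column sum, which is why the fractions above add up the counts down one column of the matrix.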
$$\text{Recall}_{\text{class}} = \frac{\text{Correct}_{\text{class}}}{N_{\text{class}}}$$
In this example:
$$ \text{Recall}_{\text{sci.space}} = \frac{76}{76+2+0} = 0.97 $$
$$ \text{Recall}_{\text{alt.atheism}} = \frac{1}{7+1+2} = 0.10 $$
$$ \text{Recall}_{\text{soc.religion.christian}} = \frac{3}{9+0+3} = 0.25 $$
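Recall uses the same diagonal counts but divides by the number of true samples per class. This is a minimal sketch in plain Python, again assuming rows are true classes and columns are predicted classes; the variable names are illustrative only.

```python
# Confusion matrix counts from this example.
# Rows are true classes, columns are predicted classes (sklearn layout).
matrix = [[76, 2, 0],
          [7, 1, 2],
          [9, 0, 3]]
names = ["sci.space", "alt.atheism", "soc.religion.christian"]

recall = {}
for i, name in enumerate(names):
    # All samples that truly belong to class i form row i.
    n_class = sum(matrix[i])
    recall[name] = matrix[i][i] / n_class

for name, value in recall.items():
    print(f"{name}: {value:.2f}")
```

Here each denominator is a row sum, so a class with few true samples, such as alt.atheism with 10, can have very low recall without moving overall accuracy much.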
$$\text{F1 score}_{\text{class}} = 2 \cdot \frac{\text{Precision}_{\text{class}} \cdot \text{Recall}_{\text{class}}}{\text{Precision}_{\text{class}} + \text{Recall}_{\text{class}}}$$
In this example:
$$ \text{F1 score}_{\text{sci.space}} = 2 \cdot \frac{0.83 \cdot 0.97}{0.83 + 0.97} = 0.89 $$
$$ \text{F1 score}_{\text{alt.atheism}} = 2 \cdot \frac{0.33 \cdot 0.10}{0.33 + 0.10} = 0.15 $$
$$ \text{F1 score}_{\text{soc.religion.christian}} = 2 \cdot \frac{0.60 \cdot 0.25}{0.60 + 0.25} = 0.35 $$
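The F1 score is the harmonic mean of precision and recall, so it stays low whenever either of the two is low. The sketch below plugs in the rounded precision and recall values of this example. The helper name `f1_from` is hypothetical, and returning 0.0 when both inputs are zero is a convention chosen here, not something stated in the slides.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        # Convention assumed here: no correct predictions means F1 = 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs, rounded as in this example.
per_class = {
    "sci.space": (0.83, 0.97),
    "alt.atheism": (0.33, 0.10),
    "soc.religion.christian": (0.60, 0.25),
}

f1 = {name: f1_from(p, r) for name, (p, r) in per_class.items()}

for name, value in f1.items():
    print(f"{name}: {value:.2f}")
```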
from sklearn.metrics import confusion_matrix
# Build the confusion matrix (rows: true classes, columns: predicted classes)
confusion_matrix(y_true, y_pred)
Output:
array([[76, 2, 0],
[ 7, 1, 2],
[ 9, 0, 3]], dtype=int64)
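The slides do not show how `y_true` and `y_pred` are built. To experiment with the numbers, one option is to rebuild label lists from the counts in the matrix. The sketch below assumes an integer encoding (0 = sci.space, 1 = alt.atheism, 2 = soc.religion.christian), which is an illustrative choice and not taken from the course.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Counts from the confusion matrix above (rows: true, columns: predicted).
counts = np.array([[76, 2, 0],
                   [7, 1, 2],
                   [9, 0, 3]])

# Assumed encoding: 0 = sci.space, 1 = alt.atheism, 2 = soc.religion.christian
y_true, y_pred = [], []
for true_label in range(3):
    for pred_label in range(3):
        n = int(counts[true_label, pred_label])
        # Emit n samples with this (true, predicted) combination.
        y_true.extend([true_label] * n)
        y_pred.extend([pred_label] * n)

# The rebuilt labels reproduce the original matrix.
rebuilt = confusion_matrix(y_true, y_pred)
print(rebuilt)
```

With label lists like these, the sklearn metric calls in this section can be run end to end.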
Metrics from sklearn
# Functions from sklearn
from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
# Accuracy
print(accuracy_score(y_true, y_pred))
$ 0.80
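To put the 0.80 in context, it helps to compare it with a trivial baseline. The sketch below assumes the class counts of this example (78, 10, and 12 true samples) and a hypothetical classifier that always predicts the most frequent class.

```python
# True labels with the class counts of this example.
y_true = (["sci.space"] * 78
          + ["alt.atheism"] * 10
          + ["soc.religion.christian"] * 12)

# Hypothetical baseline: always predict the most frequent class.
y_baseline = ["sci.space"] * len(y_true)

# Accuracy is the fraction of positions where the labels agree.
matches = sum(t == p for t, p in zip(y_true, y_baseline))
baseline_accuracy = matches / len(y_true)
print(baseline_accuracy)
```

The baseline already reaches 0.78 accuracy while never predicting two of the three classes, which is why accuracy alone says little about per-class performance on imbalanced data.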
Pass average=None to the precision, recall, and f1 functions to get one score per class
print(precision_score(y_true, y_pred, average=None))
print(recall_score(y_true, y_pred, average=None))
print(f1_score(y_true, y_pred, average=None))
$ array([0.83, 0.33, 0.60])
$ array([0.97, 0.10, 0.25])
$ array([0.89, 0.15, 0.35])
One function computes everything:
lab_names = ['sci.space', 'alt.atheism', 'soc.religion.christian']
print(classification_report(y_true, y_pred, target_names=lab_names))
                        precision    recall  f1-score   support
             sci.space       0.83      0.97      0.89        78
           alt.atheism       0.33      0.10      0.15        10
soc.religion.christian       0.60      0.25      0.35        12
             micro avg       0.80      0.80      0.80       100
             macro avg       0.59      0.44      0.47       100
          weighted avg       0.75      0.80      0.76       100
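The macro avg and weighted avg rows can be checked by hand. The macro average is the plain mean of the per-class scores. The weighted average weights each class by its support. The sketch below recomputes both for precision and recall from the confusion matrix counts; the helper names `macro` and `weighted` are illustrative, not sklearn functions.

```python
# Confusion matrix counts from this example (rows: true, columns: predicted).
matrix = [[76, 2, 0],
          [7, 1, 2],
          [9, 0, 3]]
n = len(matrix)

support = [sum(row) for row in matrix]                        # true samples per class
predicted = [sum(row[j] for row in matrix) for j in range(n)]  # predictions per class

precision = [matrix[k][k] / predicted[k] for k in range(n)]
recall = [matrix[k][k] / support[k] for k in range(n)]

def macro(values):
    """Unweighted mean: every class counts equally."""
    return sum(values) / len(values)

def weighted(values, weights):
    """Mean weighted by class support: large classes dominate."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

print(round(macro(precision), 2), round(macro(recall), 2))
print(round(weighted(precision, support), 2), round(weighted(recall, support), 2))
```

The macro averages are much lower than the weighted ones because the two small classes, which the model handles poorly, count as much as sci.space. For single-label multiclass data like this, the micro average equals accuracy.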