Recurrent Neural Networks (RNNs) for Language Modeling with Keras
David Cecchini
Data Scientist
20 classes task with 80% accuracy. Is the model good?
I have no idea!
Checking true and predicted for each class
$$\text{Precision}_{\text{class}} = \frac{\text{Correct}_{\text{class}}}{\text{Predicted}_{\text{class}}}$$
In the example:
$$ \text{Precision}_{\text{sci.space}} = \frac{76}{76+7+9} = 0.83 $$ $$ \text{Precision}_{\text{alt.atheism}} = \frac{1}{2+1+0} = 0.33 $$ $$ \text{Precision}_{\text{soc.religion.christian}} = \frac{3}{0+2+3} = 0.60 $$
$$\text{Recall}_{\text{class}} = \frac{\text{Correct}_{class}}{N_\text{class}}$$
In the example:
$$ \text{Recall}_{\text{sci.space}} = \frac{76}{76+2+0} = 0.97 $$ $$ \text{Recall}_{\text{alt.atheism}} = \frac{1}{7+1+2} = 0.10 $$ $$ \text{Recall}_{\text{soc.religion.christian}} = \frac{3}{9+0+3} = 0.25 $$
$$\text{F1 score} = 2 * \frac{\text{precision}_{\text{class}} * \text{recall}_{\text{class}}}{\text{precision}_{\text{class}} + \text{recall}_{\text{class}}}$$
In the example:
$$ f1score_{sci.space} = 2 \frac{0.83 * 0.97}{0.83 + 0.97} = 0.89 $$ $$ f1score_{alt.atheism} = 2 \frac{033 * 0.10}{033 + 0.10} = 0.15 $$ $$ f1score_{soc.religion.christian} = 2 \frac{060 * 0.25}{060 + 0.25} = 0.35 $$
from sklearn.metrics import confusion_matrix
# Build the confusion matrix confusion_matrix(y_true, y_pred)
Output:
array([[76, 2, 0],
[ 7, 1, 2],
[ 9, 0, 3]], dtype=int64)
Metrics from sklearn
# Functions of sklearn
from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
# Accuracy
print(accuracy_score(y_true, y_pred))
$ 0.80
Add average=None
to precison, recall and f1 score functions
print(precision_score(y_true, y_pred, average=None))
print(recall_score(y_true, y_pred, average=None))
print(f1_score(y_true, y_pred, average=None))
$ array([0.83, 0.33, 0.60])
$ array([0.97, 0.10, 0.25])
$ array([0.89, 0.15, 0.35])
One function measure all:
lab_names = ['sci.space', 'alt.atheism', 'soc.religion.christian']
print(classification_report(y_true, y_pred, target_names=lab_names))
precision recall f1-score support
sci.space 0.83 0.97 0.89 78
alt.atheism 0.33 0.10 0.15 10
soc.religion.christian 0.60 0.25 0.35 12
micro avg 0.80 0.80 0.80 100
macro avg 0.59 0.44 0.47 100
weighted avg 0.75 0.80 0.76 100
Recurrent Neural Networks (RNNs) for Language Modeling with Keras