Explaining unsupervised models

Explainable AI in Python

Fouad Trad

Machine Learning Engineer

Clustering

Group similar data points without pre-defined labels

Image showing data with 2 features divided into 3 clusters, each having a centroid.

Silhouette score

Measures clustering quality
Ranges from -1 to 1
Close to 1 → well-separated clusters

Image showing well-separated clusters.

Silhouette score

Measures clustering quality
Ranges from -1 to 1
Close to 1 → well-separated clusters
Close to -1 → points assigned to the wrong cluster

Image showing clusters with no clear separation.
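To make the two ends of the range concrete, here is a minimal sketch on synthetic data. The blob centers, the sample size, and the label shuffling are assumptions chosen for illustration; they do not come from the slides.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Three well-separated synthetic blobs (illustrative data only)
X_demo, _ = make_blobs(
    n_samples=200,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

# Labels found by KMeans should follow the blobs closely
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_demo)
good_score = silhouette_score(X_demo, labels)

# Shuffling the labels assigns points to clusters at random
rng = np.random.default_rng(0)
bad_score = silhouette_score(X_demo, rng.permutation(labels))

print(f"KMeans labels:   {good_score:.2f}")
print(f"Shuffled labels: {bad_score:.2f}")
```

The first score should land near the top of the range, while the shuffled labels should score around zero or below.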

Feature impact on cluster quality

Image showing the clustering result of a dataset with 2 features fed to the clustering algorithm.

Feature impact on cluster quality

Image showing the clustering result after removing one feature and retraining the model.

Feature impact on cluster quality

Image showing the formula to derive the impact of the removed feature as the difference between the silhouette scores where both features are present and the silhouette score when the feature is removed.

$\text{Impact}(f) > 0$ → $f$ contributes positively to cluster quality
$\text{Impact}(f) < 0$ → $f$ introduces noise
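Written out, the formula described above is the silhouette score with all features minus the silhouette score after removing $f$. The symbol $S$ is shorthand introduced here for the silhouette score; it does not appear on the slide.

$$\text{Impact}(f) = S(\text{all features}) - S(\text{all features except } f)$$

A positive value means the score dropped when $f$ was removed, so $f$ was helping the clustering.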

Student Performance dataset

age	health status	absences	G1	G2	G3
18	3	4	0	11	11
17	3	2	9	11	11
15	3	6	12	13	12
15	5	0	14	14	14
16	5	0	11	13	13

X: array containing features
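One possible way to obtain `X`, sketched with only the five sample rows shown above. The names `df` and `column_names` match those used in the later code, but building them like this is an assumption; the full dataset would normally be loaded from a file.

```python
import pandas as pd

# The five sample rows shown in the table above
df = pd.DataFrame(
    {
        "age": [18, 17, 15, 15, 16],
        "health status": [3, 3, 3, 5, 5],
        "absences": [4, 2, 6, 0, 0],
        "G1": [0, 9, 12, 14, 11],
        "G2": [11, 11, 13, 14, 13],
        "G3": [11, 11, 12, 14, 13],
    }
)

X = df.to_numpy()                # feature array, one column per feature
column_names = list(df.columns)  # feature names, in column order
print(X.shape)
```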

Computing feature impact on cluster quality

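import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# column_names: list of feature names, in the same order as the columns of X

# Baseline: cluster on all features and score the result
kmeans = KMeans(n_clusters=2).fit(X)
original_score = silhouette_score(X, kmeans.labels_)

for i in range(X.shape[1]):
    # Drop feature i, then retrain and rescore
    X_reduced = np.delete(X, i, axis=1)
    kmeans.fit(X_reduced)
    new_score = silhouette_score(X_reduced, kmeans.labels_)

    # Positive impact: cluster quality drops without the feature
    impact = original_score - new_score
    print(f'Feature {column_names[i]}: Impact = {impact}')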

Computing feature impact on cluster quality

Feature age: Impact = 0.05199181662741281
Feature health status: Impact = 0.06046737420227638
Feature absences: Impact = 0.031290940582026694
Feature G1: Impact = -0.025746421940652353
Feature G2: Impact = -0.02578292339364119
Feature G3: Impact = -0.03163419458330158

Adjusted Rand index (ARI)

Measures how well two cluster assignments match

Image showing two similar cluster assignments for a given dataset.

Maximum ARI = 1 → perfect cluster alignment

Adjusted Rand index (ARI)

Measures how well two cluster assignments match

Image showing two different cluster assignments for the same dataset.

Maximum ARI = 1 → perfect cluster alignment
Lower ARI → greater difference in clusterings
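A small sketch of both cases, using made-up label lists. Note that ARI compares groupings, not label names, so renaming the clusters does not lower the score.

```python
from sklearn.metrics import adjusted_rand_score

# Illustrative label lists (not from the dataset)
labels_a = [0, 0, 1, 1, 2, 2]
labels_b = [1, 1, 2, 2, 0, 0]  # same grouping as labels_a, different label names
labels_c = [0, 1, 2, 0, 1, 2]  # a different grouping

same = adjusted_rand_score(labels_a, labels_b)
different = adjusted_rand_score(labels_a, labels_c)

print(f"Same grouping:      {same}")
print(f"Different grouping: {different}")
```

The first comparison reaches the maximum of 1, and the second comes out clearly lower.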

Feature importance for cluster assignments

Remove features one at a time
$\text{Importance}(f) = 1 - \text{ARI}(\text{original clusters}, \text{modified clusters})$
Low $\text{ARI}$ → high $1 - \text{ARI}$ → important feature

Feature importance for cluster assignments

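import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# df: DataFrame that X was built from (used here for the feature names)

# Baseline cluster assignments using all features
kmeans = KMeans(n_clusters=2).fit(X)
original_clusters = kmeans.predict(X)

for i in range(X.shape[1]):
    # Drop feature i, then recluster
    X_reduced = np.delete(X, i, axis=1)
    reduced_clusters = kmeans.fit_predict(X_reduced)

    # High importance: assignments change a lot without the feature
    importance = 1 - adjusted_rand_score(original_clusters, reduced_clusters)
    print(f'{df.columns[i]}: {importance}')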

age: 0.0
health status: 0.9995376368119572
absences: 0.0
G1: 0.0
G2: 0.6204069909514572
G3: 0.6204069909514572
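KMeans starts from random centroids, so values like these can vary between runs. The sketch below wraps the same procedure in a function with a fixed seed and checks it on synthetic data where the answer is known in advance. The function name `ari_feature_importance`, the `random_state` and `n_init` settings, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score


def ari_feature_importance(X, n_clusters=2, random_state=0):
    """Return 1 - ARI for each feature after removing it and reclustering."""

    def cluster(data):
        # Fixed seed so repeated calls give the same assignments
        model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
        return model.fit_predict(data)

    original = cluster(X)
    scores = []
    for i in range(X.shape[1]):
        reduced = cluster(np.delete(X, i, axis=1))
        scores.append(1 - adjusted_rand_score(original, reduced))
    return scores


# Synthetic check: feature 0 separates two groups, feature 1 is pure noise
rng = np.random.default_rng(0)
informative = np.concatenate([rng.normal(0, 0.5, 50), rng.normal(10, 0.5, 50)])
noise = rng.normal(0, 0.1, 100)
X_demo = np.column_stack([informative, noise])

importances = ari_feature_importance(X_demo)
print(importances)
```

Removing the noise feature should leave the assignments unchanged, giving an importance of 0, while removing the informative feature should scramble them, giving an importance close to 1.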

Let's practice!
