Explainable AI in Python
Fouad Trad
Machine Learning Engineer
Group similar data points without pre-defined labels
age | health status | absences | G1 | G2 | G3 |
---|---|---|---|---|---|
18 | 3 | 4 | 0 | 11 | 11 |
17 | 3 | 2 | 9 | 11 | 11 |
15 | 3 | 6 | 12 | 13 | 12 |
15 | 5 | 0 | 14 | 14 | 14 |
16 | 5 | 0 | 11 | 13 | 13 |
X: array containing features
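As a minimal sketch of how `X` might be built, the snippet below loads only the five preview rows from the table above into a pandas DataFrame and converts them to a NumPy array. The names `df`, `column_names`, and `X` match how the slides use them later, but constructing them this way is an assumption; the full dataset has more rows and is presumably loaded from a file.

```python
import pandas as pd

# Only the five preview rows shown in the table above, not the full dataset.
df = pd.DataFrame(
    {
        "age": [18, 17, 15, 15, 16],
        "health status": [3, 3, 3, 5, 5],
        "absences": [4, 2, 6, 0, 0],
        "G1": [0, 9, 12, 14, 11],
        "G2": [11, 11, 13, 14, 13],
        "G3": [11, 11, 12, 14, 13],
    }
)

column_names = list(df.columns)  # feature names, in column order
X = df.to_numpy()                # 2-D array: one row per student, one column per feature

print(X.shape)
```

Keeping `column_names` in the same order as the columns of `X` is what lets a column index `i` be mapped back to a feature name.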
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Baseline: cluster on all features and score the clustering
kmeans = KMeans(n_clusters=2).fit(X)
original_score = silhouette_score(X, kmeans.labels_)

# column_names: feature names in the same order as the columns of X
for i in range(X.shape[1]):
    # Drop feature i, re-cluster, and re-score
    X_reduced = np.delete(X, i, axis=1)
    kmeans.fit(X_reduced)
    new_score = silhouette_score(X_reduced, kmeans.labels_)
    impact = original_score - new_score
    print(f'Feature {column_names[i]}: Impact = {impact}')
Feature age: Impact = 0.05199181662741281
Feature health status: Impact = 0.06046737420227638
Feature absences: Impact = 0.031290940582026694
Feature G1: Impact = -0.025746421940652353
Feature G2: Impact = -0.02578292339364119
Feature G3: Impact = -0.03163419458330158
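To read these numbers: a positive impact means the silhouette score dropped when the feature was removed, so the feature was helping cluster quality; a negative impact means the clusters became cleaner without it. The sketch below repeats the same procedure on synthetic data where the answer is known in advance. The function name `silhouette_feature_impact`, the synthetic data, and the fixed `random_state` are illustrative assumptions and are not part of the original slides.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def silhouette_feature_impact(X, n_clusters=2, random_state=0):
    """Return original_score - new_score for each feature dropped in turn."""

    def fit_and_score(data):
        # Fixed random_state so repeated runs give the same clustering
        model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
        labels = model.fit_predict(data)
        return silhouette_score(data, labels)

    original_score = fit_and_score(X)
    return [
        original_score - fit_and_score(np.delete(X, i, axis=1))
        for i in range(X.shape[1])
    ]


# Hypothetical data: one feature separates two groups, two features are noise
rng = np.random.default_rng(0)
n = 100
separating = np.concatenate([rng.normal(0, 0.5, n), rng.normal(10, 0.5, n)])
noise = rng.normal(0, 1, size=(2 * n, 2))
X_demo = np.column_stack([separating, noise])

impacts = silhouette_feature_impact(X_demo)
for name, impact in zip(["separating", "noise_1", "noise_2"], impacts):
    print(f"Feature {name}: Impact = {impact:.3f}")
```

On data like this, the separating feature should show the largest positive impact, since removing it leaves only noise to cluster on.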
from sklearn.metrics import adjusted_rand_score

# Baseline cluster assignments using all features
kmeans = KMeans(n_clusters=2).fit(X)
original_clusters = kmeans.predict(X)

for i in range(X.shape[1]):
    # Drop feature i and re-cluster
    X_reduced = np.delete(X, i, axis=1)
    reduced_clusters = kmeans.fit_predict(X_reduced)
    # ARI of 1 means identical assignments, so importance 0 means no change
    importance = 1 - adjusted_rand_score(original_clusters, reduced_clusters)
    print(f'{df.columns[i]}: {importance}')
age: 0.0
health status: 0.9995376368119572
absences: 0.0
G1: 0.0
G2: 0.6204069909514572
G3: 0.6204069909514572
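Here an importance of 0.0 means that removing the feature left every cluster assignment unchanged, while a value near 1 means the assignments changed almost completely. The adjusted Rand index compares how points are grouped, not the label numbers themselves, which matters because KMeans may number the clusters differently on each fit. The sketch below illustrates both points on synthetic data; the function name `ari_feature_importance`, the data, and the fixed `random_state` are illustrative assumptions, not part of the original slides.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# Swapping label numbers does not change the grouping, so the ARI stays at 1
swapped_ari = adjusted_rand_score([0, 0, 1, 1], [1, 1, 0, 0])
print(swapped_ari)


def ari_feature_importance(X, n_clusters=2, random_state=0):
    """Return 1 - ARI between the full clustering and each reduced clustering."""

    def cluster(data):
        model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
        return model.fit_predict(data)

    original_clusters = cluster(X)
    return [
        1 - adjusted_rand_score(original_clusters, cluster(np.delete(X, i, axis=1)))
        for i in range(X.shape[1])
    ]


# Hypothetical data: one feature separates two groups, two features are noise
rng = np.random.default_rng(0)
n = 100
separating = np.concatenate([rng.normal(0, 0.5, n), rng.normal(10, 0.5, n)])
noise = rng.normal(0, 1, size=(2 * n, 2))
X_demo = np.column_stack([separating, noise])

importances = ari_feature_importance(X_demo)
for name, importance in zip(["separating", "noise_1", "noise_2"], importances):
    print(f"{name}: {importance:.3f}")
```

The two measures answer different questions: the silhouette-based impact asks whether cluster quality changes without a feature, while the ARI-based importance asks whether cluster membership changes.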