Permutation importance

Explainable AI in Python

Fouad Trad

Machine Learning Engineer

Shuffling notes to determine an instrument's importance

Image showing a band of musicians playing different instruments.

Permutation importance

Model-agnostic method
Assesses feature importance
- Measures effect of shuffling a feature on model performance
Highly versatile
- Works with any trained model, including neural networks

Image showing the structure of neural network: a collection of hidden layers, each having multiple neurons.

Permutation importance in action

Image showing a dataset of 5 features and a trained ML model.

Permutation importance in action

Image showing the dataset being fed to the ML model to get a baseline performance.

Permutation importance in action

Image showing one feature being shuffled: its values were [1, 2, 3, 4] in the original dataset and became [4, 2, 1, 3] in the shuffled one. The shuffled dataset is fed to the ML model to obtain the shuffled performance.

Permutation importance in action

Image showing that the importance of the shuffled feature is proportional to the drop in performance.

Permutation importance in action

Image showing that a significant drop indicates a high importance for that feature, while a small drop suggests lower importance.
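
The steps above can be sketched by hand. This is a minimal illustration, not how any library implements it: the data is synthetic, the model is a stand-in logistic regression, and `manual_permutation_importance` is a hypothetical helper name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic stand-in data (assumption): 5 features, like the toy dataset on the slide
X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, n_clusters_per_class=1,
                           random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

def manual_permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each column (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Step 1: baseline performance on the untouched data
    baseline = accuracy_score(y, model.predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            # Step 2: shuffle one feature, leave the others as they are
            X_shuffled = X.copy()
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            # Step 3: importance = baseline minus shuffled performance
            shuffled = accuracy_score(y, model.predict(X_shuffled))
            drops.append(baseline - shuffled)
        importances[j] = np.mean(drops)
    return baseline, importances

baseline, importances = manual_permutation_importance(model, X, y)
print(f"baseline accuracy: {baseline:.3f}")
print("importance per feature:", np.round(importances, 3))
```

A large value means the model relied on that feature; a value near zero means shuffling it barely changed the predictions.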

Admissions dataset

GRE Score	TOEFL Score	University Rating	SOP	LOR	CGPA	Chance of Admit	Accept
337	118	4	4.5	4.5	9.65	0.92	1
324	107	4	4	4.5	8.87	0.76	1
316	104	3	3	3.5	8	0.72	1
322	110	3	3.5	2.5	8.67	0.8	1
314	103	2	2	3	8.21	0.45	0

The features and labels are stored in X_train and y_train.
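
The slides do not show how X_train and y_train were built. A possible sketch, using only the five rows shown above and assuming that Accept is the label and that Chance of Admit is excluded from the features (the six importance values printed later are consistent with six feature columns):

```python
import pandas as pd

# The five rows shown on the slide; the full dataset has more rows
df = pd.DataFrame({
    "GRE Score": [337, 324, 316, 322, 314],
    "TOEFL Score": [118, 107, 104, 110, 103],
    "University Rating": [4, 4, 3, 3, 2],
    "SOP": [4.5, 4.0, 3.0, 3.5, 2.0],
    "LOR": [4.5, 4.5, 3.5, 2.5, 3.0],
    "CGPA": [9.65, 8.87, 8.00, 8.67, 8.21],
    "Chance of Admit": [0.92, 0.76, 0.72, 0.80, 0.45],
    "Accept": [1, 1, 1, 1, 0],
})

# Assumption: Accept is the target; Chance of Admit is dropped so the
# model cannot read the answer from a near-duplicate of the label
X = df.drop(columns=["Chance of Admit", "Accept"])
y = df["Accept"]
print(X.shape, y.tolist())
```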

MLPClassifier

from sklearn.neural_network import MLPClassifier
model = MLPClassifier(hidden_layer_sizes=(10,10))

model.fit(X_train, y_train)

Permutation importance

from sklearn.inspection import permutation_importance

result = permutation_importance(model,
                                X_train, y_train,
                                n_repeats=10,
                                random_state=42,
                                scoring='accuracy')

print(result.importances_mean)

[0.16213568 0.13831658 0.10575377 0.10522613 0.11741206 0.20072864]
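
The printed array is easier to read when paired with feature names and the spread across repeats. A sketch under stated assumptions: the data is synthetic and only borrows the slide's six column names, and `max_iter` and the model's `random_state` are added here for reproducibility.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (assumption) with the slide's six feature names
cols = ["GRE Score", "TOEFL Score", "University Rating", "SOP", "LOR", "CGPA"]
X_arr, y_train = make_classification(n_samples=300, n_features=6,
                                     n_informative=4, n_redundant=0,
                                     random_state=0)
X_train = pd.DataFrame(X_arr, columns=cols)

model = MLPClassifier(hidden_layer_sizes=(10, 10), max_iter=1000,
                      random_state=0).fit(X_train, y_train)
result = permutation_importance(model, X_train, y_train, n_repeats=10,
                                random_state=42, scoring="accuracy")

# One row per feature: mean and standard deviation over the 10 shuffles
summary = pd.DataFrame({
    "importance_mean": result.importances_mean,
    "importance_std": result.importances_std,
}, index=X_train.columns).sort_values("importance_mean", ascending=False)
print(summary)
```

`result.importances` holds the raw scores with one row per feature and one column per repeat; a large standard deviation relative to the mean suggests the ranking of that feature is unstable.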

Visualizing importance

import matplotlib.pyplot as plt
plt.bar(X_train.columns,
        result.importances_mean)

Bar plot of permutation importances showing that CGPA has the highest importance.
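
Sorting the bars makes the ranking easier to read. The sketch below reuses the values printed earlier and assumes they follow the column order of the admissions table; the axis label and title are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Assumption: values are in the same order as the table's feature columns
features = ["GRE Score", "TOEFL Score", "University Rating",
            "SOP", "LOR", "CGPA"]
importances = np.array([0.16213568, 0.13831658, 0.10575377,
                        0.10522613, 0.11741206, 0.20072864])

# Ascending order, so the most important feature ends up as the top bar
order = np.argsort(importances)
fig, ax = plt.subplots()
ax.barh(np.array(features)[order], importances[order])
ax.set_xlabel("Mean drop in accuracy")
ax.set_title("Permutation importance")
fig.tight_layout()
```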

Comparison with model-specific approaches

Permutation importance

plt.bar(X_train.columns, result.importances_mean)

Bar plot of permutation importances showing that CGPA has the highest importance.

Logistic regression

import numpy as np
plt.bar(X_train.columns, np.abs(log_reg.coef_[0]))  # log_reg: a fitted LogisticRegression

Bar plot of logistic regression coefficients showing that CGPA has the highest importance.
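
The slides do not show how log_reg was trained. A minimal sketch of the comparison, assuming synthetic data with the slide's column names and standardized features (coefficient magnitudes are only comparable when features share a scale). Unlike the slides, both importance measures here are computed on the same logistic regression model.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption) with the slide's six feature names
cols = ["GRE Score", "TOEFL Score", "University Rating", "SOP", "LOR", "CGPA"]
X_arr, y_train = make_classification(n_samples=300, n_features=6,
                                     n_informative=4, n_redundant=0,
                                     random_state=0)
# Standardize so that coefficient magnitudes can be compared across features
X_train = pd.DataFrame(StandardScaler().fit_transform(X_arr), columns=cols)

log_reg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Model-specific importance: absolute value of each coefficient
coef_importance = np.abs(log_reg.coef_[0])

# Model-agnostic importance: drop in accuracy after shuffling each feature
perm = permutation_importance(log_reg, X_train, y_train, n_repeats=10,
                              random_state=42, scoring="accuracy")

comparison = pd.DataFrame({
    "abs_coefficient": coef_importance,
    "permutation_importance": perm.importances_mean,
}, index=cols)
print(comparison)
```

The two columns are on different scales, so compare the rankings they produce rather than the raw numbers.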

Let's practice!
