Suppose I have the following model, built from this synthetic data:

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=50, n_informative=9,
                           n_redundant=0, n_repeated=0, n_classes=10,
                           n_clusters_per_class=1, class_sep=9, flip_y=0.2,
                           random_state=17)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestClassifier()
model.fit(X_train, y_train)
```

And I calculate the features' SHAP values:
```python
explainer = shap.Explainer(model)
shap_values = explainer.shap_values(X_test)

type(shap_values)
# list
```

To calculate each class's average feature importance separately, I do:
```python
abs_sv = np.abs(shap_values)
avg_feature_importance_per_class = np.mean(abs_sv, axis=1)

avg_feature_importance_per_class.shape
# (10, 50)
```

Question
Now, how do I calculate the mean of absolute SHAP values across all classes, so that I can treat it as the model's overall feature importance (derived from SHAP values)?
I did it like this:
```python
feature_importance_overall = np.mean(abs_sv, axis=0)
```

But then I got confused. Am I really doing this right? Especially when I look at the shape:
```python
feature_importance_overall.shape
# (250, 50)
```

I was expecting something with the shape `(number_of_features,)`, similar to what I get from:
```python
model.feature_importances_.shape
# (50,)
```

`avg_feature_importance_per_class.shape` follows this pattern too, except with `number_of_classes` as the leading dimension (i.e. `(10, 50)`), since it is computed for each class separately.
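To trace what each axis means, I also tried a minimal NumPy-only sketch of the shapes involved (the random arrays below are just hypothetical stand-ins for the real SHAP values; they mimic the list of 10 arrays of shape `(250, 50)` that `shap_values` holds above):

```python
import numpy as np

# Hypothetical stand-in for the list returned by explainer.shap_values(X_test):
# n_classes arrays, each of shape (n_samples, n_features); values are random.
n_classes, n_samples, n_features = 10, 250, 50
rng = np.random.default_rng(0)
shap_values = [rng.normal(size=(n_samples, n_features)) for _ in range(n_classes)]

# np.abs stacks the list into a single 3-D array
abs_sv = np.abs(shap_values)
print(abs_sv.shape)  # (10, 250, 50) = (classes, samples, features)

# axis=1 averages over samples -> one importance vector per class
per_class = np.mean(abs_sv, axis=1)
print(per_class.shape)  # (10, 50)

# axis=0 averages over classes only, keeping one row per sample
per_sample = np.mean(abs_sv, axis=0)
print(per_sample.shape)  # (250, 50)

# Averaging over both classes and samples gives one value per feature
overall = np.mean(abs_sv, axis=(0, 1))
print(overall.shape)  # (50,)

# Equivalent: average the per-class importances over the class axis
assert np.allclose(overall, per_class.mean(axis=0))
```

So it seems `axis=0` collapses only the class dimension, which would explain the `(250, 50)` shape I saw.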