Understanding SHAP values for binary classification

Question

I have trained a CatBoostClassifier on an imbalanced dataset (binary classification). Now I am trying to interpret the model using the SHAP library. Below is the code to fit the model and calculate the SHAP values:

    weights = y.value_counts()[0] / y.value_counts()[1]
    catboost_clf = CatBoostClassifier(loss_function='Logloss', iterations=100, verbose=True,
                                      l2_leaf_reg=6, scale_pos_weight=weights, eval_metric="MCC")
    catboost_clf.fit(X, y)
    trainx_preds = catboost_clf.predict(X_test)
    explainer = shap.TreeExplainer(catboost_clf)
    shap_values = explainer.shap_values(Pool(X, y))

    # Class 0 samples 1625125
    # Class 1 samples 122235

The shape of the SHAP values is (1747360, 13), i.e. (number of instances, number of features). I was expecting a 3D array, i.e. (number of classes, number of instances, number of features), with separate SHAP values for the positive and the negative class. How do I achieve that? How do I extract class-wise Shapley values to better understand the model?

Also, explainer.expected_value shows one base value instead of two.

Is there anything missing or incorrect in the code?

Thanks in advance!

Dhvani Shah · Accepted Answer · 2023-01-12 02:25:35Z

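Setting loss_function to 'MultiClass' solved the problem; see the CatBoost documentation. With 'Logloss' the model has a single output (the log-odds of the positive class), which is why SHAP returns one 2D array and one base value.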

model = CatBoostClassifier(loss_function = 'MultiClass')
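
If retraining with 'MultiClass' is not an option, a class-wise view can probably be derived from the binary output directly. The sketch below assumes the single-output SHAP values explain the raw log-odds of class 1, so the class 0 contributions are simply their negation. The helper name split_binary_shap and the toy numbers are made up for illustration; in practice the inputs would be explainer.shap_values(...) and explainer.expected_value.

```python
import numpy as np


def split_binary_shap(shap_values, expected_value):
    """Expand single-output SHAP values into a per-class view.

    Assumption: the values explain the log-odds of class 1, so the
    log-odds of class 0 (and every contribution to them) are the negation.
    """
    shap_values = np.asarray(shap_values, dtype=float)
    # Index 0 -> class 0, index 1 -> class 1; shape (2, n_samples, n_features)
    per_class = np.stack([-shap_values, shap_values])
    base = np.array([-expected_value, expected_value], dtype=float)
    return per_class, base


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Toy stand-ins for explainer.shap_values(...) and explainer.expected_value
toy_shap = np.array([[0.5, -0.2, 0.1],
                     [-1.0, 0.3, 0.0]])
toy_base = -2.0

per_class, base = split_binary_shap(toy_shap, toy_base)

# Additivity: base value + sum of contributions = log-odds for each class
margins = base[:, None] + per_class.sum(axis=2)  # shape (2, n_samples)
probs = sigmoid(margins)                         # per-class probabilities

print(per_class.shape)
print(probs.sum(axis=0))  # the two class probabilities of each sample sum to 1
```

Because the two classes mirror each other, the class 0 array carries no extra information; it only flips the sign of each feature's contribution.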
