I have trained a CatBoostClassifier on an imbalanced dataset (binary classification). Now I am trying to interpret the model using the SHAP library. Below is the code that fits the model and calculates the SHAP values:

import shap
from catboost import CatBoostClassifier, Pool

# Weight the positive class by the negative-to-positive ratio
weights = y.value_counts()[0] / y.value_counts()[1]
catboost_clf = CatBoostClassifier(loss_function='Logloss', iterations=100, verbose=True,
                                  l2_leaf_reg=6, scale_pos_weight=weights, eval_metric="MCC")
catboost_clf.fit(X, y)
trainx_preds = catboost_clf.predict(X_test)

explainer = shap.TreeExplainer(catboost_clf)
shap_values = explainer.shap_values(Pool(X, y))

# Class 0 samples: 1625125
# Class 1 samples: 122235

The shape of the SHAP values is (1747360, 13), i.e. (number of instances, number of features). I was expecting a 3D array, i.e. (number of classes, number of instances, number of features), with SHAP values for each of the positive and negative classes. How do I achieve that? How do I extract class-wise Shapley values to better understand the model?

Also, explainer.expected_value shows one base value instead of two.

Is there anything missing or incorrect in the code?

Thanks in advance!

1 Answer

Setting loss_function to 'MultiClass' solved the problem. I referred to the documentation: CatBoost

model = CatBoostClassifier(loss_function = 'MultiClass') 
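
If you would rather keep the binary Logloss model, class-wise values can probably be derived from the single array instead. With a single log-odds output, the contributions towards class 0 are the negation of those towards class 1, and the same holds for the base value. Below is a minimal sketch of that idea using only NumPy. The helper name classwise_shap and the toy numbers are made up for illustration and merely stand in for explainer.shap_values(...) and explainer.expected_value.

```python
import numpy as np

def classwise_shap(shap_values, expected_value):
    """Expand single-output binary SHAP values into per-class arrays.

    shap_values: shape (n_samples, n_features), contributions to the raw
    log-odds of class 1. Returns an array of shape
    (2, n_samples, n_features) and a length-2 array of base values.
    """
    shap_values = np.asarray(shap_values, dtype=float)
    # Index 0 holds class 0 (negated values), index 1 holds class 1.
    per_class = np.stack([-shap_values, shap_values])
    base_values = np.array([-expected_value, expected_value], dtype=float)
    return per_class, base_values

# Toy stand-ins (hypothetical numbers, not taken from the question's data)
toy_shap = np.array([[0.2, -0.5, 0.1],
                     [-0.3, 0.4, 0.0]])
toy_base = -2.6

per_class, base_values = classwise_shap(toy_shap, toy_base)
print(per_class.shape)  # (2, 2, 3)
```

Note that these values are in log-odds space, not probability space, so the two classes mirror each other exactly and the second set adds no new information beyond the sign.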