I am trying to use the Shapley value (SHAP) approach to understand my model's predictions. The model is an XGBoost classifier. My plot looks like the one below.
Can someone help me interpret it, or confirm that my understanding is correct?
My interpretation:
1) High values of Feature 5 (indicated by the rose/purple colors) push the prediction toward 1
2) Low values of Feature 5 (indicated by blue) push the prediction toward 0
3) Points 1 and 2 apply to Feature 1 as well
4) Low values of Feature 6 push the prediction toward 1, and high values of Feature 6 push it toward 0
5) Low values of Feature 8 push the prediction toward 1, and high values of Feature 8 push it toward 1 as well. If the points sit toward the extreme of the x-axis (say between 1 and 2, or between 2 and 3), does that mean low values of this feature (in this case) have a large impact toward prediction 1? Am I right?
6) Why don't I see all 45 of my features in the plot, irrespective of their importance/influence? Shouldn't I see them with no color when they have no importance? Why do I only see around 12-14 features?
7) What role do Feature 43, Feature 55, and Feature 14 play in the prediction output?
8) Why do the SHAP values range from -2 to 2?
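On point 8, my current guess (which may be wrong) is that for an XGBoost binary classifier the SHAP values are expressed in log-odds (raw margin) units rather than probabilities, which would explain values outside 0 to 1. A minimal sketch of that assumption, using made-up numbers; the base value and the per-feature contributions below are hypothetical and not taken from my model:

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical numbers for one row, for illustration only.
base_value = 0.0  # assumed average model output in log-odds
shap_values = {
    "Feature 5": 2.0,   # large positive contribution, pushes toward class 1
    "Feature 6": -0.5,  # negative contribution, pushes toward class 0
    "Feature 1": 0.3,
}

# SHAP values are additive: base value plus all contributions gives
# the model's raw output for this row (log-odds under this assumption).
margin = base_value + sum(shap_values.values())

# Converting the log-odds margin gives the predicted probability of class 1.
probability = sigmoid(margin)

print(f"margin (log-odds): {margin:.2f}")
print(f"probability of class 1: {probability:.3f}")
```

If this reading is right, a SHAP value of +2 would not mean "adds 2 to the probability" but "adds 2 to the log-odds", and the x-axis range simply reflects how far individual features move the log-odds.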
Can someone help me with this?
