The Great

I am trying to use the Shapley value approach to understand my model's predictions. I am applying it to an XGBoost model. My plot is shown below:

enter image description here

Can someone help me interpret this? Or confirm my understanding is correct?

My interpretation

1) High values of Feature 5 (indicated by the rose/purple colors) lead to prediction 1

2) Low values of Feature 5 (indicated by blue) lead to prediction 0

3) Points 1 and 2 apply to Feature 1 as well

4) Low values of Feature 6 lead to prediction 1, and high values of Feature 6 lead to prediction 0

5) Low values of Feature 8 lead to prediction 1, and high values of Feature 8 lead to prediction 1 as well. If a point sits at the extreme of the x-axis (meaning in x(1,2) or x(2,3)), it means that the low values (in this case) of this feature have a huge impact on prediction 1. Am I right?

6) Why don't I see all 45 of my features in the plot, irrespective of their importance/influence? Shouldn't features with no importance simply show up with no color? Why do I only see around 12-14 features?

7) What role do Feature 43, Feature 55, and Feature 14 play in the prediction output?

8) Why does the SHAP value range from -2 to 2?
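
To make points 1 to 5 and 8 concrete, here is a minimal sketch of the additive reading of SHAP values. It assumes the values are expressed in log-odds (the raw margin), which is the usual default when explaining a binary XGBoost classifier but is not confirmed for this plot. The base value and per-feature numbers are made up for illustration and are not read from the plot above.

```python
import math

def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical numbers for a single row (assumptions, not taken from the plot).
base_value = -0.4  # assumed average model output, in log-odds
shap_values = {"Feature 5": 1.5, "Feature 1": 0.6, "Feature 6": -0.9}

def predicted_probability(base, contributions):
    """Add the base value and the per-feature SHAP values, then convert to a probability."""
    margin = base + sum(contributions.values())
    return sigmoid(margin)

p_base = sigmoid(base_value)                            # probability before any feature is considered
p_row = predicted_probability(base_value, shap_values)  # probability for this row
print(round(p_base, 3), round(p_row, 3))
```

Under that assumption, a positive SHAP value pushes this row toward class 1 and a negative one toward class 0, and a range such as -2 to 2 is a range of log-odds shifts rather than of probabilities.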

Can someone help me with this?
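
Regarding point 6, a summary plot of this kind typically orders features by mean absolute SHAP value and draws only the top few; in the `shap` package this is controlled by the `max_display` argument of `shap.summary_plot`, which, to my knowledge, defaults to 20. The sketch below mimics that ranking in plain Python. The helper name `rank_features` and all numbers are hypothetical, and this is not the library's actual implementation.

```python
def rank_features(shap_matrix, feature_names, max_display=20):
    """Order features by mean |SHAP| (largest first) and keep the top max_display."""
    n_rows = len(shap_matrix)
    mean_abs = []
    for j, name in enumerate(feature_names):
        total = sum(abs(row[j]) for row in shap_matrix)
        mean_abs.append((name, total / n_rows))
    mean_abs.sort(key=lambda item: item[1], reverse=True)
    return mean_abs[:max_display]

# Hypothetical SHAP values: 3 rows (samples) by 4 features.
shap_matrix = [
    [1.2, -0.1, 0.0, 0.4],
    [-0.8, 0.2, 0.0, -0.5],
    [1.0, -0.3, 0.0, 0.3],
]
feature_names = ["Feature 5", "Feature 43", "Feature 14", "Feature 8"]

# Only the three most influential features survive the cut.
top = rank_features(shap_matrix, feature_names, max_display=3)
for name, score in top:
    print(name, round(score, 3))
```

Features near the bottom of such a ranking have SHAP values clustered around zero, so they shift the prediction very little for most rows; features below the cutoff are simply not drawn.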

How to interpret a Shapley value plot for a model?
