
I have a dataset for a binary classification problem. In total, I have around 60 features.

When I used XGBoost feature importance, I saw that the top 5 features account for about 42% of the total importance, the rest of the ~50 features account for 40-49% combined (roughly 1% each), and the remaining 8-10 features have zero importance or less than 1%.

This is my best parameter list for XGBoost after grid search:

    op_params = {'alpha': [10], 'as_pandas': [True], 'colsample_bytree': [0.5],
                 'early_stopping_rounds': [100], 'learning_rate': [0.04],
                 'max_depth': [6], 'metrics': ['auc'], 'num_boost_round': [10000],
                 'objective': ['reg:logistic'], 'scale_pos_weight': [3.08],
                 'seed': [123], 'subsample': [0.75]}
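In case it matters, this is roughly how I unwrap those single-element grid-search lists and feed them to `xgb.cv` (`X_train` and `y_train` here stand for my training split):

    import xgboost as xgb

    # the grid-search dict stores single-element lists, so unwrap them first
    params = {k: v[0] for k, v in op_params.items()}

    # pull out the arguments that belong to xgb.cv() rather than the booster
    num_boost_round = params.pop('num_boost_round')
    early_stopping_rounds = params.pop('early_stopping_rounds')
    metrics = params.pop('metrics')
    as_pandas = params.pop('as_pandas')
    seed = params.pop('seed')

    dtrain = xgb.DMatrix(X_train, label=y_train)
    cv_results = xgb.cv(params, dtrain, num_boost_round=num_boost_round,
                        nfold=5, metrics=metrics, as_pandas=as_pandas,
                        seed=seed, early_stopping_rounds=early_stopping_rounds)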

Since I have many low-importance features, should I use them all in my model to improve the model metrics?

When I built the model with only the top 5 features, I got 80% accuracy.

I am trying to understand whether it is even useful to use these low-importance features for prediction.

Shown below is my feature importance in descending order:

[feature importance plot]

Do they even really help?
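To make the question concrete, this is the kind of comparison I have in mind: cross-validated AUC with only the top-k features versus all of them. A rough sketch (`X` and `y` are my feature matrix, as a NumPy array, and labels):

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    clf = XGBClassifier(learning_rate=0.04, max_depth=6, subsample=0.75,
                        colsample_bytree=0.5, scale_pos_weight=3.08,
                        reg_alpha=10, random_state=123)

    # rank features by the default importance of a model fit on all of them
    ranked = np.argsort(clf.fit(X, y).feature_importances_)[::-1]

    # compare cross-validated AUC for growing feature subsets
    for k in (5, 15, 30, X.shape[1]):
        auc = cross_val_score(clf, X[:, ranked[:k]], y,
                              scoring='roc_auc', cv=5).mean()
        print(f'top {k:2d} features: mean CV AUC = {auc:.3f}')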

Any insights would be really helpful.


2 Answers


It's all about a trade-off.

The more unimportant features you add, the more marginal the benefits become, while you risk injecting extra complexity and potentially overfitting.

Occam's razor.

Also be careful with the default feature importance approach. Read this.
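For instance, permutation importance is one alternative; a minimal sketch with scikit-learn's inspection module (`clf` is assumed to be your fitted classifier, `X_val`/`y_val` a held-out set):

    from sklearn.inspection import permutation_importance

    # shuffle each feature on held-out data and measure the drop in AUC
    result = permutation_importance(clf, X_val, y_val, scoring='roc_auc',
                                    n_repeats=10, random_state=123)
    for i in result.importances_mean.argsort()[::-1]:
        print(f'feature {i}: {result.importances_mean[i]:.4f} '
              f'+/- {result.importances_std[i]:.4f}')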

  • Appreciate your input. Upvoted. (Commented Dec 17, 2019 at 12:37)
  • Hi, I have a question regarding permutation importance. (Commented Dec 17, 2019 at 12:47)
  • Can you please put it in another question and I will gladly answer it. Also, if you are satisfied with answers (generally, not necessarily mine), don't forget to accept them. (Commented Dec 17, 2019 at 12:49)
  • Hi, in addition, does it make sense to add a random-number column, ignore any feature whose importance falls below this newly generated column, and keep the rest for model prediction? Am I correct in thinking this way? (Commented Dec 17, 2019 at 13:12)
  • I would not do it. It's overkill; you could just use scikit-learn.org/stable/modules/generated/… with PermutationImportance packed together. (Commented Dec 17, 2019 at 13:31)

Adding low-value features might not help you surpass your current accuracy. Getting better-quality data, adding more data to the dataset, or training for more boosting rounds if the model has not yet converged might help you gain more accuracy.
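For the convergence part, you can watch a validation metric across boosting rounds with early stopping; a sketch, assuming `params` holds your booster parameters (with `eval_metric` set, e.g. to `'auc'`) and `dtrain`/`dvalid` are DMatrix objects for your train/validation split:

    import xgboost as xgb

    evals_result = {}
    booster = xgb.train(params, dtrain, num_boost_round=10000,
                        evals=[(dtrain, 'train'), (dvalid, 'valid')],
                        early_stopping_rounds=100,
                        evals_result=evals_result, verbose_eval=200)

    # if best_iteration is far below num_boost_round, more rounds won't help
    print('best round:', booster.best_iteration)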

  • Since there are a lot of low-importance features, do you think it could be a data quality issue? Upvoted for the help. (Commented Dec 17, 2019 at 12:14)
