Variable importance from GLMNET

Question

I am looking at using the lasso as a method for selecting features and fitting a predictive model with a binary target. Below is some code I was playing with to try out the method with regularized logistic regression.

My question: I get a group of "significant" variables, but can I rank-order them to estimate the relative importance of each? Can the coefficients be standardized so that they can be ranked by absolute value (I understand that the coef function shows them on the original variable scale)? If so, how should I do it (using the standard deviations of x and y, as in "Standardize Regression Coefficients")?

SAMPLE CODE:

    library(glmnet)

    # data comes from
    # http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
    datasetTest <- read.csv('C:/Documents and Settings/E997608/Desktop/wdbc.data.txt', header = FALSE)

    # appears to use the first level as the target success
    datasetTest$V2 <- as.factor(ifelse(as.character(datasetTest$V2) == "M", "0", "1"))

    # cross validation to find optimal lambda
    # using the lasso because alpha=1
    cv.result <- cv.glmnet(
      x = as.matrix(datasetTest[, 3:ncol(datasetTest)]),
      y = datasetTest[, 2],
      family = "binomial",
      nfolds = 10,
      type.measure = "deviance",
      alpha = 1
    )

    # values of lambda used
    hist(cv.result$lambda)

    # plot of the error measure (here was deviance)
    # as a CI from each of the 10 folds
    # for each value of lambda (log actually)
    plot(cv.result)

    # the mean cross validation error (one for each of the
    # 100 values of lambda)
    cv.result$cvm

    # the value of lambda that minimizes the error measure
    # result: 0.001909601
    cv.result$lambda.min
    log(cv.result$lambda.min)

    # the largest value of lambda whose error is
    # within 1 SE of the minimum
    # result: 0.007024236
    cv.result$lambda.1se

    # the full sequence was fit in the object called cv.result$glmnet.fit
    # this is same as a call to it directly.
    # here are the coefficients from the 1-SE lambda
    coef(cv.result$glmnet.fit, s = cv.result$lambda.1se)

Yevgeny · Accepted Answer · 2011-09-01 19:13:11Z

17

As far as I know, glmnet does not calculate the standard errors of regression coefficients (since it fits model parameters using cyclic coordinate descent). So, if you need standardized regression coefficients, you will need to use some other method (e.g. glm).

Having said that, if the explanatory variables are standardized before the fit and glmnet is called with "standardize=FALSE", then the less important coefficients will be smaller than the more important ones, so you could rank them just by their magnitude. This becomes even more pronounced with a non-trivial amount of shrinkage (i.e. non-zero lambda).
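To make the suggestion above concrete, here is a minimal sketch. The data are simulated purely for illustration (the column names `a`, `b`, `c` and the effect sizes are my assumptions, not anything from the question's dataset), and I pick `lambda.1se` arbitrarily; treat it as one possible way to do this rather than a definitive recipe.

```r
library(glmnet)

set.seed(1)
n <- 200

# Simulated predictors on very different scales (illustrative data only)
X <- cbind(a = rnorm(n, sd = 1), b = rnorm(n, sd = 10), c = rnorm(n, sd = 100))

# Assumed true effects per standard deviation: a strong, b moderate, c none
eta <- 2 * X[, "a"] / 1 + 1 * X[, "b"] / 10
y <- factor(rbinom(n, 1, plogis(eta)))

# Standardize outside glmnet, then switch off the internal standardization
Xs <- scale(X)
fit <- cv.glmnet(Xs, y, family = "binomial", alpha = 1, standardize = FALSE)

# Coefficients are now per standard deviation of each predictor;
# drop the intercept and rank by absolute value
b <- as.matrix(coef(fit, s = "lambda.1se"))[-1, 1]
ranking <- sort(abs(b), decreasing = TRUE)
print(ranking)
```

Variables that the lasso drops show up with a magnitude of exactly zero at the bottom of the ranking.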

Hope this helps..

edited Sep 1, 2011 at 19:13

answered Sep 1, 2011 at 16:27

Yevgeny


2

thanks. I believe the coefficients are returned on the original scale, so one would need to rescale them (I assume by using the technique I posted, for example).
– B_Miner, Sep 1, 2011 at 18:18

user6129 is right! You don't get any means of ranking the variables selected. It's an active area of research.
– suncoolsu, Sep 1, 2011 at 18:31
4

@B_Miner: you are right, if called with "standardize=TRUE", glmnet returns coefficients on the original scale. One way to get around that is to standardize the explanatory variables outside (e.g. using the "scale()" function) and call glmnet with "standardize=FALSE". The resulting coefficients could then be ranked by magnitude to judge their importance.
– Yevgeny, Sep 1, 2011 at 18:57

@suncoolsu: please see my updated answer above
– Yevgeny, Sep 1, 2011 at 19:14
1

@Yevgeny I have a question. Technically, should the performance results (e.g. area under the curve) be the same whether we set 'standardize=FALSE' and standardize the variables ourselves or just use 'standardize=TRUE'? (Only the beta coefficients returned would be different.) This is what I expect in theory, but in practice I get slightly better results when I use 'standardize=TRUE'. Hence, both the coefficients and the performance are different. Is this how it should be?
– Michelle, Sep 11, 2017 at 6:21


Antoine Lizée · Accepted Answer · 2018-09-25 14:47:50Z

9

To get the coefficients in a space that lets you directly compare their importance, you have to standardize them. I wrote a note on Thinklab to discuss standardization of logistic regression coefficients.

(Very) long story short, I advise using the Agresti method:

    # if X is the input matrix of the glmnet function,
    # and cv.result is your glmnet object:
    sds <- apply(X, 2, sd)
    cs <- as.matrix(coef(cv.result, s = "lambda.min"))
    std_coefs <- cs[-1, 1] * sds

If you relied on internal standardization by glmnet (the default option, standardize = TRUE), these standardized coefficients are actually the ones resulting from the fitting step, before glmnet transforms them back to the original space (see another note :-) ).
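As a sanity check on the multiply-by-the-SD rule, here is a small sketch using base R only. It deliberately uses an unpenalized `glm` on simulated data (both are my assumptions for illustration, not part of the question), because there the identity $b^* = b \cdot \sigma_x$ can be verified directly by refitting on standardized predictors.

```r
set.seed(42)
n <- 500

# Simulated predictors with different means and spreads (illustrative only)
X <- cbind(x1 = rnorm(n, mean = 5, sd = 2),
           x2 = rnorm(n, mean = -1, sd = 0.5))
y <- rbinom(n, 1, plogis(-2 + 0.4 * X[, "x1"] + 1.0 * X[, "x2"]))

# Fit on the original scale, then rescale: b* = b * sd(x)
fit_raw <- glm(y ~ X, family = binomial)
sds <- apply(X, 2, sd)
std_coefs <- coef(fit_raw)[-1] * sds

# Fit directly on standardized predictors for comparison
fit_std <- glm(y ~ scale(X), family = binomial)

# The two sets of slopes should agree up to numerical tolerance
same <- all.equal(unname(std_coefs), unname(coef(fit_std)[-1]), tolerance = 1e-6)
print(same)
```

Dividing by the standard deviation instead would not reproduce the slopes of the standardized fit.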

edited Sep 25, 2018 at 14:47

answered May 8, 2016 at 0:05

Antoine Lizée


2

I think your last line should be std_coefs <- coefs[-1, 1] * sds. This corresponds with your note, which says $$b^* = b \cdot \sigma_x$$ for the Agresti method. I find this counter-intuitive but correct. The non-standardized coefficient is the amount of change in the result for a unit change in the predictor. The normalized coefficient is the change in the result for a 1-standard-deviation change in the predictor; to get this, you must multiply by the SD.
– Kent Johnson, Feb 3, 2017 at 14:01
1

Antoine - Can you confirm that multiplication and not division is proper here?
– B_Miner, Sep 9, 2017 at 13:35
1

Indeed, you multiply the coefficient by $\sigma_x$. The linear score is of the form $\dots + b \cdot x + \dots = \dots + (b \cdot \sigma_x) \cdot (x-\mu)/\sigma_x + \dots$, i.e. $b \cdot \sigma_x$ is the coefficient of the standardized $x$.
– VictorZurkowski, Aug 2, 2018 at 21:59

Yes, it's a typo (yet another reminder to never type examples without running the code ;-) ). Thanks for catching it, it's fixed.
– Antoine Lizée, Sep 25, 2018 at 14:48
1

I think std_coefs <- coefs[-1, 1] * sds should be std_coefs <- cs[-1, 1] * sds, no?
– Christopher John, Nov 5, 2019 at 8:50
