I want to loop through the vars in a dataframe, calling lm() on each one, and so I wrote this:

    findvars <- function(x = samsungData, dv = 'activity', id = 'subject') {
      # Loops through the possible predictor vars, does an lm() predicting the dv
      # from each, and returns a data.frame of coefficients, one row per IV.
      r <- data.frame()
      # All varnames apart from the dependent var, and the case identifier
      ivs <- setdiff(names(x), c(dv, id))
      for (iv in ivs) {
        print(paste("trying", iv))
        m <- lm(dv ~ iv, data = x, na.rm = TRUE)
        # Take the absolute value of the coefficient, then transpose.
        c <- t(as.data.frame(sapply(m$coefficients, abs)))
        c$iv <- iv # which IV produced this row?
        r <- c(r, c)
      }
      return(r)
    }

This doesn't work, I believe b/c the formula in the lm() call consists of function-local variables that hold strings naming vars in the passed-in dataframe (e.g., "my_dependant_var" and "this_iv") as opposed to pointers to the actual variable objects.
I tried wrapping that formula in eval(parse(text = )), but could not get that to work.
If I'm right about the problem, can someone explain to me how to get R to resolve the contents of those vars iv & dv into the pointers I need? Or if I'm wrong, can someone explain what else is going on?
Many thanks!
Here is some repro code:

    library(datasets)
    data(USJudgeRatings)
    findvars(x = USJudgeRatings, dv = 'CONT', id = 'DILG')

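See ?reformulate (you can search SO for that keyword). It builds a formula object from character strings, which is what the lm() call inside the loop needs.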
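
A minimal sketch of how reformulate() might be used for this, run against the repro data from the question. The function name findvars2 and the output column names (iv, intercept, slope) are hypothetical, chosen for illustration. It also departs from the original in two ways: it drops na.rm (lm() handles missing values through its na.action argument instead) and collects one data.frame row per predictor with rbind().

```r
library(datasets)

# Hypothetical rewrite of findvars(); findvars2 and the column names
# below are illustrative, not from the original post.
findvars2 <- function(x, dv, id) {
  # All variable names apart from the dependent var and the case identifier
  ivs <- setdiff(names(x), c(dv, id))
  rows <- lapply(ivs, function(iv) {
    # reformulate() turns the strings held in iv and dv into the formula dv ~ iv
    f <- reformulate(iv, response = dv)
    m <- lm(f, data = x)
    co <- abs(coef(m))  # absolute values of intercept and slope
    data.frame(iv = iv, intercept = co[[1]], slope = co[[2]])
  })
  # Stack the per-predictor rows into one data.frame
  do.call(rbind, rows)
}

res <- findvars2(USJudgeRatings, dv = "CONT", id = "DILG")
print(res)
```

as.formula(paste(dv, "~", iv)) is a common alternative that builds the same kind of formula from a pasted string.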