I have a simple multivariate regression model testing the impact of two variables on another over one year (1971). I have a large data set that includes these data over a number of years and am being asked to use "reframe" to rerun the model over 48 years (1961-2018) and then create a scatterplot of the "BB" coefficients by year. I am having a difficult time understanding how to use "reframe" in this context and wondering if anyone can either point me in the right direction, or help me better understand. I will have no issues creating the plot, but I can't figure out the code to rerun the model 48 times and give me a dataframe with the results using "reframe".

    fit <- Teams %>%
      filter(yearID == 1971) %>%
      mutate(BB = BB/G, HR = HR/G, R = R/G) %>%
      lm(R ~ BB + HR, data = .)
    tidy(fit, conf.int = TRUE)

I have scoured Google and the R help file for "reframe" but still cannot figure out where to even start. I tried to just use the below code, but was given an error and of course I don't even think I'm even on the right track since I cannot figure out how to use "reframe" properly.

fit %>% reframe(Teams$yearID %in% 1961:2018) 

Error in UseMethod("reframe") : no applicable method for 'reframe' applied to an object of class "lm"

  • I realize you don't decide this yourself but why don't you include the year in the model? Something like lm(R ~ (BB + HR) * factor(yearID), data = ...). You can even specify different residual variances between years using an nlme:::gls model. Anyway, I believe you want update and not reframe. Commented Jul 22, 2024 at 5:46
  • Try tidy(update(fit, data=filter(data, yearID %in% 1961:2018))) Commented Jul 22, 2024 at 6:08
  • I tried to use update as suggested above, but received the following error message: `Error in UseMethod("filter") : no applicable method for 'filter' applied to an object of class "function"`. I asked the question in the class discussion board and the TA replied that I can also use "summarize" instead of "reframe", but I still have no idea where to start. I understand the summarize function but do not know how to run 48 regressions and then make a scatterplot of the results other than running each one manually. Thoughts? Commented Jul 23, 2024 at 2:05
  • I also tried to add year into the model as suggested above and it seems to have worked, though it is also giving me rows with coefficients for just the year. I suppose I can find a way to now filter the results down to just the BBxyear coefficients. This will get me the solution I need, but I wonder if there is a simpler way to tell R to run 48 regressions by varying year. Thanks so much for the help! Commented Jul 23, 2024 at 2:29
  • Yes, there is a simpler way: library(nlme); fits <- lmList(R ~ BB + HR | yearID, data = DF); summary(fits). Commented Jul 23, 2024 at 5:49
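The lmList suggestion in the last comment can be sketched as follows. This is only an illustration under stated assumptions: `DF` here is a small simulated stand-in (not the real Teams table), and the grouping column is converted to a factor explicitly to be safe.

```r
library(nlme)  # a recommended package that ships with standard R installations

set.seed(42)

# Hypothetical stand-in for the per-game team data (NOT the real Teams table)
DF <- expand.grid(yearID = 1961:1965, team = 1:20)
DF$BB <- rnorm(nrow(DF), mean = 3.2, sd = 0.4)
DF$HR <- rnorm(nrow(DF), mean = 0.9, sd = 0.2)
DF$R  <- 1 + 0.4 * DF$BB + 1.5 * DF$HR + rnorm(nrow(DF), sd = 0.3)
DF$yearID <- factor(DF$yearID)

# One lm() fit per level of yearID
fits <- lmList(R ~ BB + HR | yearID, data = DF)

# coef() collects the per-year coefficients: one row per year, one column per term
year_coefs <- coef(fits)
year_coefs$BB
```

The row names of `year_coefs` are the year labels, so the BB column can be plotted against them directly.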

1 Answer

reframe is the counterpart of summarize. While summarize returns exactly one row per group, reframe can return any number of rows per group.
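A minimal sketch of that difference, using a made-up toy tibble (the names `toy`, `one_row`, and `two_rows` are only for illustration):

```r
library(dplyr)  # reframe() needs dplyr >= 1.1.0

# Hypothetical toy data: two groups with three values each
toy <- tibble(
  g = c("a", "a", "a", "b", "b", "b"),
  x = c(1, 2, 3, 10, 20, 30)
)

# summarize(): exactly one row per group
one_row <- toy %>%
  group_by(g) %>%
  summarize(mean_x = mean(x))

# reframe(): any number of rows per group (here, two quantiles per group)
two_rows <- toy %>%
  group_by(g) %>%
  reframe(prob = c(0.25, 0.75), q = quantile(x, c(0.25, 0.75)))
```

Here `one_row` has 2 rows and `two_rows` has 4. An unnamed expression that returns a data frame (such as the output of `tidy()`) is unpacked into columns, which is what makes reframe convenient for collecting model coefficients.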

That is, you group your data set according to year and return the tidied result from lm within reframe:

    library(dplyr)  # reframe() and pick() require dplyr >= 1.1.0
    library(broom)

    set.seed(20240723)

    # Simulated data: 58 years x 12 BB values x 10 HR values
    sample_data <- expand.grid(year = 1961:2018, BB = 1:12, HR = runif(10)) %>%
      as_tibble() %>%
      mutate(R = year / 100 + 5 * BB + 10 * HR + rnorm(n())) %>%
      arrange(year)

    # Fit one model per year. pick(everything()) supplies only the current
    # group's rows; `data = .` would hand lm() the full data set and give
    # identical estimates for every year.
    sample_data %>%
      group_by(year) %>%
      reframe(tidy(lm(R ~ BB + HR, data = pick(everything())), conf.int = TRUE))
    # Result: a tibble with 174 rows (58 years x 3 terms) and 8 columns:
    # year, term, estimate, std.error, statistic, p.value, conf.low, conf.high

What you get is a tibble with one row for each parameter of the model for each year (3 terms × 58 years = 174 rows). For your plot you will probably need to filter the result (for example, keep only the rows where term == "BB") or pivot it, but this should get you started.
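Applied to the question's model, the whole pipeline might look roughly like the sketch below. It assumes dplyr >= 1.1.0, broom, and ggplot2 are installed. `teams_mock` is a simulated, hypothetical stand-in with only the columns the model uses; with the real data you would start from `Teams` instead.

```r
library(dplyr)    # reframe() and pick() need dplyr >= 1.1.0
library(broom)
library(ggplot2)

set.seed(1)

# Hypothetical stand-in for the Teams table (simulated values, 20 teams per year)
teams_mock <- expand.grid(yearID = 1961:2018, team = 1:20) %>%
  as_tibble() %>%
  mutate(
    G  = 162,
    BB = round(rnorm(n(), mean = 520, sd = 60)),
    HR = round(rnorm(n(), mean = 150, sd = 30)),
    R  = round(300 + 0.35 * BB + 1.5 * HR + rnorm(n(), sd = 40))
  )

bb_by_year <- teams_mock %>%
  filter(yearID %in% 1961:2018) %>%
  mutate(BB = BB / G, HR = HR / G, R = R / G) %>%   # per-game rates, as in the question
  group_by(yearID) %>%
  # pick() hands lm() only the current year's rows
  reframe(tidy(lm(R ~ BB + HR, data = pick(BB, HR, R)), conf.int = TRUE)) %>%
  filter(term == "BB")                              # keep only the BB coefficient

# Scatterplot of the BB coefficient by year
bb_plot <- ggplot(bb_by_year, aes(x = yearID, y = estimate)) +
  geom_point() +
  labs(x = "Year", y = "BB coefficient")
print(bb_plot)
```

Because `conf.int = TRUE` is passed to `tidy()`, `bb_by_year` also carries `conf.low` and `conf.high`, which could be added to the plot with `geom_errorbar()` if desired.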
