
Grouped regression runs fine in model1 with do(). However, the documentation now says that do() is superseded and suggests using across() instead, but the help file gives no example. Model2 comes from the do() help page and runs fine without map() or across(); I don't understand how the regression loops over the groups without map(). When I tried map() in model3, I got an error. Model4, from Hadley's book R for Data Science, uses split() and works well. How do I tell map() to use the list-column data? Any suggestions?

    library(purrr)
    #> Warning: package 'purrr' was built under R version 3.6.3
    library(tidyverse)
    #> Warning: package 'tidyverse' was built under R version 3.6.3
    #> Warning: package 'ggplot2' was built under R version 3.6.3
    #> Warning: package 'tidyr' was built under R version 3.6.3
    #> Warning: package 'dplyr' was built under R version 3.6.3
    #> Warning: package 'stringr' was built under R version 3.6.3
    #> Warning: package 'forcats' was built under R version 3.6.3

    model1 = mtcars %>%
      group_by(cyl) %>%
      do(mod = lm(mpg ~ disp, data = .))
    model1
    #> # A tibble: 3 x 2
    #> # Rowwise:
    #>     cyl mod
    #>   <dbl> <list>
    #> 1     4 <lm>
    #> 2     6 <lm>
    #> 3     8 <lm>

    ## from "do" help file
    model2 = mtcars %>%
      nest_by(cyl) %>%
      mutate(mod = list(lm(mpg ~ disp, data = data)))
    model2
    #> # A tibble: 3 x 3
    #> # Rowwise:  cyl
    #>     cyl                data mod
    #>   <dbl> <list<tbl_df[,10]>> <list>
    #> 1     4           [11 x 10] <lm>
    #> 2     6            [7 x 10] <lm>
    #> 3     8           [14 x 10] <lm>

    ## using map
    model3 = mtcars %>%
      nest_by(cyl) %>%
      mutate(fit = map(data, ~lm(mpg ~ disp, data = .)))
    #> Error: Problem with `mutate()` input `fit`.
    #> x numeric 'envir' arg not of length one
    #> i Input `fit` is `map(data, ~lm(mpg ~ disp, data = .))`.
    #> i The error occured in row 1.

    ## model4
    model4 = mtcars %>%
      split(.$cyl) %>%
      map(~lm(mpg ~ disp, data = .))
    model4
    #> $`4`
    #>
    #> Call:
    #> lm(formula = mpg ~ disp, data = .)
    #>
    #> Coefficients:
    #> (Intercept)         disp
    #>     40.8720      -0.1351
    #>
    #>
    #> $`6`
    #>
    #> Call:
    #> lm(formula = mpg ~ disp, data = .)
    #>
    #> Coefficients:
    #> (Intercept)         disp
    #>   19.081987     0.003605
    #>
    #>
    #> $`8`
    #>
    #> Call:
    #> lm(formula = mpg ~ disp, data = .)
    #>
    #> Coefficients:
    #> (Intercept)         disp
    #>    22.03280     -0.01963

Created on 2020-08-02 by the reprex package (v0.3.0)

1 Answer


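The problem is probably the rowwise attribute: nest_by() returns a rowwise data frame, so inside mutate() the name data refers to the single data frame in the current row, and map() then iterates over its columns instead of over the data frames. We can ungroup() first to remove that attribute: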

    library(dplyr)
    library(purrr)
    mtcars %>%
      nest_by(cyl) %>%   # creates the rowwise attribute
      ungroup %>%        # removes the rowwise attribute
      mutate(fit = map(data, ~lm(mpg ~ disp, data = .)))
    # A tibble: 3 x 3
    #    cyl                data fit
    #  <dbl> <list<tbl_df[,10]>> <list>
    #1     4           [11 × 10] <lm>
    #2     6            [7 × 10] <lm>
    #3     8           [14 × 10] <lm>
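To make the effect of the rowwise attribute visible, here is a minimal sketch (an illustration added for clarity, not part of the original answer) that records what map() actually iterates over in each case. It assumes dplyr 1.0 or later; the names nested, by_row, by_col and what_map_sees are made up for this example.

```r
library(dplyr)
library(purrr)

nested <- mtcars %>% nest_by(cyl)

# nest_by() returns a rowwise data frame
class(nested)[1]

# Rowwise: inside mutate(), `data` is the single data frame of the current
# row, so map_chr() walks over its columns
by_row <- nested %>%
  mutate(what_map_sees = list(map_chr(data, ~ class(.x)[1])))
by_row$what_map_sees[[1]]   # one class per column of the nested data frame

# After ungroup(): `data` is the whole list-column, so map_chr() walks over
# the nested data frames, one per value of cyl
by_col <- nested %>%
  ungroup() %>%
  mutate(what_map_sees = map_chr(data, ~ class(.x)[1]))
by_col$what_map_sees        # one class per nested data frame
```

In the rowwise case each element handed to the mapped function is a plain column vector, which is consistent with lm() complaining about a numeric 'envir' argument in model3.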

6 Comments

Thank you for your answer. Could you explain in more detail why model2 works and model4 does not? How do we know that ungroup should be used? It is not intuitive. Also, how does model2 work without looping?
@RamakrishnaS model4 splits the data into a plain list, so there is no grouping attribute. You get the same result with mtcars %>% group_split(cyl) %>% map(~ lm(mpg ~ disp, data = .)). I guess you meant model3 instead of model4?
Yes, I meant model3. Please explain why ungroup should be used; I still don't get it. Also, in model2, why does lm have to be wrapped in list()?
@RamakrishnaS in model2, you wrap lm in list() because the model object is not a simple atomic value. It is itself a list with many components (check its str), so it has to be wrapped in a list to be stored as a single, self-contained element of a column. As for ungroup: nest_by adds an extra rowwise attribute, while group_by followed by nest does not, i.e. mtcars %>% group_by(cyl) %>% nest %>% mutate(fit = map(data, ~lm(mpg ~ disp, data = .))) works.
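The two routes mentioned in this comment can be put side by side. The following is a sketch under the assumption of dplyr 1.0 or later and tidyr 1.0 or later; the names fit_rowwise, fit_nested, slopes_rowwise and slopes_nested are hypothetical and chosen only for this illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Rowwise route: mutate() is evaluated one row at a time, so no map() is
# needed; list() keeps each lm object as a single cell of the new column
fit_rowwise <- mtcars %>%
  nest_by(cyl) %>%
  mutate(mod = list(lm(mpg ~ disp, data = data)))

# group_by() + nest() route: no rowwise attribute, `data` stays a list of
# data frames, so map() does the looping
fit_nested <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(mod = map(data, ~ lm(mpg ~ disp, data = .x))) %>%
  arrange(cyl)   # put groups in the same order as nest_by()

# Both routes should give the same slope for disp in every group
slopes_rowwise <- map_dbl(fit_rowwise$mod, ~ coef(.x)[["disp"]])
slopes_nested  <- map_dbl(fit_nested$mod,  ~ coef(.x)[["disp"]])
all.equal(slopes_rowwise, slopes_nested)
```

Either route ends with a list-column of fitted models; the difference is only whether the per-group loop is implicit (rowwise) or explicit (map).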
@RamakrishnaS as for why it behaves differently with nest_by: the extra attribute it adds may be a bug, and it could be corrected in the next release.