Big picture: I want my user-defined function to iterate through a list (or vector) of arguments, like a loop. (In this case each argument is a character string.)
    get_avg2 <- function(v_name) {
      avg <- "_Average"
      data_1 <- PFF_College_Defense_data %>%
        dplyr::group_by(Name) %>%
        dplyr::summarise("{{ v_name }}_{avg}" := mean({{ v_name }}, na.rm = TRUE))
      PFF_NCAA_Average_grades <- merge(PFF_NCAA_Average_grades, data_1, by = "Name")
      return(PFF_NCAA_Average_grades)
    }

    v_names <- list("hits", "tackles", "forced_fumbles")

    for (i in v_names) {
      get_avg2(i)
    }
    # didn't work

    PFF_NCAA_Average_grades <- purrr::map_df(v_names, get_avg2)
    # didn't work

I am trying to get averages by group from a dataframe and store them in another dataframe. I have written a UDF that accepts one argument, the name of a variable in the original dataframe. The UDF runs the calculation and merges the result into a dataframe that I created beforehand and pre-formatted to fit the results of the UDF. I want to pass a list to my function and have it iterate over that list like a loop, but I can't seem to grasp this concept, or the use of purrr::map, which I thought would do the trick.
I know I can do this:
    PFF_NCAA_Average_grades <- get_avg2(hits)
    PFF_NCAA_Average_grades <- get_avg2(tackles)
    PFF_NCAA_Average_grades <- get_avg2(forced_fumbles)

But that seems ugly and slow. Can someone please help me conceptually understand the best way to do this?
Thanks in advance!!!
*** UPDATED WITH REPREX ******
    library(tidyverse)

    data_sample <- data.frame(
      Name = c("Dalton Campbell", "Dalton Campbell", "Dalton Campbell",
               "Andre Walker", "Andre Walker", "Andre Walker"),
      Defense_Grade = c(88, 86, 92, 94, 97, 95),
      Tackle_Grade = c(66, 69, 72, 74, 76, 78),
      Coverage_Grade = c(44, 43, 44, 76, 73, 78)
    )

    # Here I set up the dataframe which the function will bind to
    data_sample_averages <- data_sample %>%
      group_by(Name) %>%
      dplyr::summarise(Defense_Grade_Average = mean(Defense_Grade))
    #> `summarise()` ungrouping output (override with `.groups` argument)

    # Function which computes the average of a variable (the only argument)
    # and merges it back into data_sample_averages
    get_avg2 <- function(v_name) {
      avg <- "_Average"
      data_1 <- data_sample %>%
        dplyr::group_by(Name) %>%
        dplyr::summarise("{{ v_name }}_{avg}" := mean({{ v_name }}, na.rm = TRUE))
      data_sample_averages <- merge(data_sample_averages, data_1, by = "Name")
      return(data_sample_averages)
    }

    # This works - it computes the average of Tackle_Grade and binds it to data_sample_averages
    data_sample_averages <- get_avg2(Tackle_Grade)
    #> `summarise()` ungrouping output (override with `.groups` argument)

    # Shows you the averages
    print(data_sample_averages)
    #>              Name Defense_Grade_Average Tackle_Grade__Average
    #> 1    Andre Walker              95.33333                    76
    #> 2 Dalton Campbell              88.66667                    69

    # Neither of these works - this is where I'm stuck
    variable_list <- list("Defense_Grade", "Tackle_Grade", "Coverage Grade")

    data_sample_averages <- lapply(variable_list, get_avg2)
    #> Warning in mean.default(~"Defense_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> Warning in mean.default(~"Defense_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> `summarise()` ungrouping output (override with `.groups` argument)
    #> Warning in mean.default(~"Tackle_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> Warning in mean.default(~"Tackle_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> `summarise()` ungrouping output (override with `.groups` argument)
    #> Warning in mean.default(~"Coverage Grade", na.rm = TRUE): argument is not
    #> numeric or logical: returning NA
    #> Warning in mean.default(~"Coverage Grade", na.rm = TRUE): argument is not
    #> numeric or logical: returning NA
    #> `summarise()` ungrouping output (override with `.groups` argument)

    data_sample_averages <- purrr::map(variable_list, get_avg2)
    #> Warning in mean.default(~"Defense_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> Warning in mean.default(~"Defense_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> `summarise()` ungrouping output (override with `.groups` argument)
    #> Warning in mean.default(~"Tackle_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> Warning in mean.default(~"Tackle_Grade", na.rm = TRUE): argument is not numeric
    #> or logical: returning NA
    #> `summarise()` ungrouping output (override with `.groups` argument)
    #> Warning in mean.default(~"Coverage Grade", na.rm = TRUE): argument is not
    #> numeric or logical: returning NA
    #> Warning in mean.default(~"Coverage Grade", na.rm = TRUE): argument is not
    #> numeric or logical: returning NA
    #> `summarise()` ungrouping output (override with `.groups` argument)

This feels like a really simple operation - compute the mean by group from one dataframe and bind it to another dataframe - and that is not really the part I'm struggling with. What I want is for my function to iterate through a series of arguments automatically. I want to be able to quickly build a list (or vector - I'm not set on using lists) of variables and pass it to the function as the argument, so that it builds a dataframe with the variables I feed it. But I'm open to the idea that I have something conceptually wrong and that I should be using a loop, purrr, map, etc., or changing the way my function is written.
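For reference, here is a minimal sketch of the kind of result I am after, not a definitive solution. It assumes the variable names are supplied as character strings, which is why the `{{ v_name }}` version above ends up calling `mean()` on a string. The names `variable_names`, `get_avg_chr`, `averages_across` and `averages_mapped` are hypothetical and only used for illustration, and the third column is spelled `Coverage_Grade` here so that it matches the data. Option 1 avoids iteration entirely with `across()`; option 2 keeps a one-variable function, looks the column up with `.data[[v_name]]`, and joins the pieces afterwards.

```r
library(dplyr)

data_sample <- data.frame(
  Name = c("Dalton Campbell", "Dalton Campbell", "Dalton Campbell",
           "Andre Walker", "Andre Walker", "Andre Walker"),
  Defense_Grade = c(88, 86, 92, 94, 97, 95),
  Tackle_Grade = c(66, 69, 72, 74, 76, 78),
  Coverage_Grade = c(44, 43, 44, 76, 73, 78)
)

# Hypothetical name: character vector of the columns to average
variable_names <- c("Defense_Grade", "Tackle_Grade", "Coverage_Grade")

# Option 1: no explicit iteration; across() applies mean() to each
# selected column and .names builds the "<column>_Average" names
averages_across <- data_sample %>%
  group_by(Name) %>%
  summarise(
    across(all_of(variable_names), ~ mean(.x, na.rm = TRUE),
           .names = "{.col}_Average"),
    .groups = "drop"
  )

# Option 2: a function that takes the column name as a string.
# .data[[v_name]] looks the column up by name, and "{v_name}_Average"
# uses glue syntax on the left-hand side of :=
get_avg_chr <- function(v_name) {
  data_sample %>%
    group_by(Name) %>%
    summarise("{v_name}_Average" := mean(.data[[v_name]], na.rm = TRUE),
              .groups = "drop")
}

# map() returns one small dataframe per variable;
# reduce() joins them pairwise on Name into a single dataframe
averages_mapped <- variable_names %>%
  purrr::map(get_avg_chr) %>%
  purrr::reduce(inner_join, by = "Name")
```

In this sketch the function no longer merges into a dataframe that lives outside of it; each call returns its own piece and the combining happens once at the end, which is what lets `map()` be used without losing intermediate results.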