How do I iterate through multiple variables

Question

I have a sample dataset as below:

    Day <- c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2)
    Group <- c("A","A","A","B","B","B","C","C","C","A","A","A","A","B","B","B","C","C","C")
    Rain <- c(4,4,6,5,3,4,5,5,3,6,6,6,5,3,3,3,2,5,2)
    UV <- c(6,6,7,8,5,6,5,6,6,6,7,7,8,8,5,6,8,5,7)
    dat <- data.frame(Day, Group, Rain, UV)

I want to run a Kruskal-Wallis test comparing 'A', 'B' and 'C' in "Group" for the variables "Rain" and "UV", separately for each "Day". At present, I subset the variables one by one and run the test on each, as below:

    # Load the packages first: %>% and select() come from the tidyverse
    library(rstatix)
    library(tidyverse)

    dat_Rain <- dat %>% select(c(Day, Group, Rain))

    dat_Rain %>%
      group_by(Day) %>%
      kruskal_test(Rain ~ Group)

How do I repeat the Kruskal test for multiple variables (Rain, UV) in this dataset? Thanks.

You could reshape to long format and use a second grouping variable.
– Roland, Commented Jan 6, 2021 at 11:59
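
The comment above does not include code. A minimal sketch of what it seems to suggest, assuming `pivot_longer()` from tidyr for the reshape; the column names `Variable` and `Value` are made up for illustration:

```r
library(rstatix)
library(tidyverse)

# Sample data from the question
dat <- data.frame(
  Day   = c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2),
  Group = c("A","A","A","B","B","B","C","C","C","A","A","A","A","B","B","B","C","C","C"),
  Rain  = c(4,4,6,5,3,4,5,5,3,6,6,6,5,3,3,3,2,5,2),
  UV    = c(6,6,7,8,5,6,5,6,6,6,7,7,8,8,5,6,8,5,7)
)

# Stack Rain and UV into one value column, with a second column
# recording which variable each value came from
dat_long <- dat %>%
  pivot_longer(cols = c(Rain, UV), names_to = "Variable", values_to = "Value")

# Group by Day and by the variable name, then run one test per group
res_long <- dat_long %>%
  group_by(Day, Variable) %>%
  kruskal_test(Value ~ Group)

res_long
```

With this layout, adding another measured variable only means adding it to `cols` in `pivot_longer()`; the test call itself stays the same.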

Ronak Shah · Accepted Answer · 2021-01-06 11:52:38Z

You can define the columns that you want to apply kruskal_test to and use map_df to collect all the results in one data frame.

    library(rstatix)
    library(tidyverse)

    cols <- c('Rain', 'UV')

    map_df(cols, ~dat %>% group_by(Day) %>% kruskal_test(reformulate('Group', .x)))

    #    Day .y.       n statistic    df      p method
    #  <dbl> <chr> <int>     <dbl> <int>  <dbl> <chr>
    #1     1 Rain      9     0.505     2 0.777  Kruskal-Wallis
    #2     2 Rain     10     6.52      2 0.0384 Kruskal-Wallis
    #3     1 UV        9     1.16      2 0.56   Kruskal-Wallis
    #4     2 UV       10     0.423     2 0.809  Kruskal-Wallis
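
In case the `reformulate('Group', .x)` call is unclear: it builds the formula from strings, with the first argument as the right-hand side and the second as the response. A small base R illustration (the names `f1` and `f2` are only for this example):

```r
# reformulate(termlabels, response) builds response ~ termlabels
f1 <- reformulate("Group", "Rain")

# The same formula built by pasting a string and converting it
f2 <- as.formula(paste("Rain", "~ Group"))

print(f1)
print(f2)

# Both deparse to the same text
identical(deparse(f1), deparse(f2))
```

Either way works here; `reformulate` just avoids assembling the formula text by hand.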

stefan · Accepted Answer · 2021-01-06 11:50:34Z

This can be done with lapply and a small helper function that builds the formula for each variable and runs the test by Day. bind_rows then combines the resulting list into one data frame.

    Day <- c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2)
    Group <- c("A","A","A","B","B","B","C","C","C","A","A","A","A","B","B","B","C","C","C")
    Rain <- c(4,4,6,5,3,4,5,5,3,6,6,6,5,3,3,3,2,5,2)
    UV <- c(6,6,7,8,5,6,5,6,6,6,7,7,8,8,5,6,8,5,7)
    dat <- data.frame(Day, Group, Rain, UV)

    library(rstatix)
    library(tidyverse)

    # Helper: run the Kruskal-Wallis test of variable x by Group, within each Day
    kt <- function(x, data) {
      fmla <- as.formula(paste(x, "~ Group"))
      data %>%
        group_by(Day) %>%
        kruskal_test(fmla)
    }

    lapply(c("Rain", "UV"), kt, data = dat) %>%
      bind_rows()
    #> # A tibble: 4 x 7
    #>     Day .y.       n statistic    df      p method
    #>   <dbl> <chr> <int>     <dbl> <int>  <dbl> <chr>
    #> 1     1 Rain      9     0.505     2 0.777  Kruskal-Wallis
    #> 2     2 Rain     10     6.52      2 0.0384 Kruskal-Wallis
    #> 3     1 UV        9     1.16      2 0.56   Kruskal-Wallis
    #> 4     2 UV       10     0.423     2 0.809  Kruskal-Wallis
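
If installing rstatix is not an option, the same loop can be sketched in base R with `stats::kruskal.test`. This is an alternative to the answers above, not taken from them; the result name `res` and its column names are chosen for illustration:

```r
# Sample data from the question
dat <- data.frame(
  Day   = c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2),
  Group = c("A","A","A","B","B","B","C","C","C","A","A","A","A","B","B","B","C","C","C"),
  Rain  = c(4,4,6,5,3,4,5,5,3,6,6,6,5,3,3,3,2,5,2),
  UV    = c(6,6,7,8,5,6,5,6,6,6,7,7,8,8,5,6,8,5,7)
)

cols <- c("Rain", "UV")

# Outer loop over variables, inner loop over the per-Day subsets
res <- do.call(rbind, lapply(cols, function(v) {
  do.call(rbind, lapply(split(dat, dat$Day), function(d) {
    kt <- kruskal.test(reformulate("Group", v), data = d)
    data.frame(
      Day       = d$Day[1],
      variable  = v,
      n         = nrow(d),
      statistic = unname(kt$statistic),
      df        = unname(kt$parameter),
      p         = kt$p.value
    )
  }))
}))
rownames(res) <- NULL

res
```

The rows come out in the same order as in the answers (Rain for Day 1 and 2, then UV for Day 1 and 2), and the statistics should agree with the tables shown there.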
