Loop through variable names in R

Question

I have a potentially very stupid question, but I can't seem to find a solution easily. I'm also pretty new to R, so please forgive my ignorance.

I'm looking for a way to loop through all variables in my dataframe. For instance, to make two-way tables of all variables compared to one specific variable (say, Sex or Educational level). I used to work with Stata, but since R is free, I am now supposed to work with R (I heard there are a plethora of other benefits to working with R as well, so I am very willing to learn :)).

Say, I have 20 variables, of which 15 are answers from a survey and 5 are demographic variables. I would like to see how different answers compare to differences in demographics.

Normally I would tackle the problem above in Stata with something as simple as:

for i = 1 to 5 { for j = 1 to 3 { tab Sex Var`i'_`j', chi2 } }

making 15 tables, for the variables Var1_1 to Var5_3 vs Sex, and giving a Pearson chi2 statistic.

So, I tried what I thought was the same for R:

for (i in 1:5) { for (j in 1:3){ print(table(chisq.test(paste(df$Sex, "df$Var",i,"_",j,sep="")))) } }

but this doesn't work.

Can anyone please point me in the right direction as to how to solve this? Any help is highly appreciated!

You can use summary(df) or lapply(df, table). The first gives a summary of the data.frame in which numerical variables are summarized with min, max, mean, and median, and categorical (factor) variables with a table. The second gives you a list of tables, one per variable.
– kath, Commented Oct 2, 2019 at 8:59
You really need to study help("$"). It explains when you can use $ and when to use [] and [[]] instead. In general, approaches that work well in one language do not necessarily transfer well to another language. This is such a case.
– Roland, Commented Oct 2, 2019 at 9:22
Thanks, I'll read up on that and try again. I also edited my question a bit, since my example seems poorly chosen (the first comment already shows how to get similar results another way).
– Eelco, Commented Oct 2, 2019 at 9:40

Yuriy Barvinchenko · Accepted Answer · 2019-10-02 10:59:39Z

Let's assume that df is your data and that the first 15 columns are the answers. In that case you can use:

lapply(df[,1:15], function(x) {chisq.test(x, df$Sex)})
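
If you would rather keep the structure of the Stata loop and go through the variables by name, something like the following might work. It is only a sketch on made-up data: the simulated df, the vars vector, and the results list are my own illustrative names, not something from the question. The key point is that df[[name]] accepts a string built with paste0(), whereas df$ does not.

```r
# Simulated survey data; all column names and values here are illustrative assumptions
set.seed(1)
n <- 200
df <- data.frame(Sex = factor(sample(c("F", "M"), n, replace = TRUE)))
for (i in 1:5) {
  for (j in 1:3) {
    df[[paste0("Var", i, "_", j)]] <- factor(sample(1:4, n, replace = TRUE))
  }
}

# Build the 15 column names as strings: Var1_1, Var1_2, ..., Var5_3
vars <- paste0("Var", rep(1:5, each = 3), "_", rep(1:3, times = 5))

# df[[v]] looks a column up by a string; df$v would look for a column literally named "v"
results <- list()
for (v in vars) {
  tab <- table(df$Sex, df[[v]])    # two-way table: Sex vs. this answer
  results[[v]] <- chisq.test(tab)  # Pearson chi-squared test on that table
  print(tab)
  print(results[[v]])
}
```

Because the tests are stored in a named list, the p-values can be collected afterwards with sapply(results, function(r) r$p.value).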
