I have a potentially very stupid question, but I can't seem to find a solution easily. I'm also pretty new to R, so please forgive my ignorance.
I'm looking for a way to loop through all variables in my dataframe. For instance, to make two-way tables of all variables compared to one specific variable (say, Sex or Educational level). I used to work with Stata, but since R is free, I am now supposed to work with R (I heard there are a plethora of other benefits to working with R as well, so I am very willing to learn :)).
Say, I have 20 variables, of which 15 are answers from a survey and 5 are demographic variables. I would like to see how different answers compare to differences in demographics.
Normally I would tackle the problem above in Stata with something as simple as:
    for i = 1 to 5 {
        for j = 1 to 3 {
            tab Sex Var`i'_`j', chi2
        }
    }

making 15 tables, for the variables Var1_1 to Var5_3 vs Sex, and giving a Pearson chi2 statistic.
So, I tried what I thought was the same for R:
    for (i in 1:5) {
      for (j in 1:3) {
        print(table(chisq.test(paste(df$Sex, "df$Var",i,"_",j,sep=""))))
      }
    }

but this doesn't work.
Can anyone please point me in the right direction on how to solve this? Any help is highly appreciated!
Try summary(df) or lapply(df, table). The first gives you a summary of the data.frame in which numerical variables are summarized with min, max, mean and median, and categorical (factor) variables with a table. The second gives you a list of tables of your variables.

See help("$"). It explains when you can use $ and when to use [] and [[]] instead. In general, approaches that work well in one language do not necessarily transfer well to another language. This is such a case.
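To make the point about $ versus [[]] concrete, here is a minimal sketch of one way the loop could be written in R. The toy data frame is made up purely for illustration; only the column names (Sex, Var1_1 to Var5_3) follow the question. The key idea is that paste() only builds a string, and df$... cannot take a string, whereas df[[...]] can. Note also that chisq.test() applies a continuity correction to 2x2 tables by default, so correct = FALSE is used here on the assumption that the uncorrected Pearson statistic (as in Stata's chi2 option) is what is wanted.

```r
# Toy data: column names follow the question's Var<i>_<j> pattern,
# but the values are random and purely illustrative.
set.seed(1)
n <- 200
df <- data.frame(Sex = factor(sample(c("F", "M"), n, replace = TRUE)))
for (i in 1:5) {
  for (j in 1:3) {
    df[[paste0("Var", i, "_", j)]] <-
      factor(sample(c("no", "yes"), n, replace = TRUE))
  }
}

# Build the 15 column names as strings: Var1_1, Var1_2, ..., Var5_3.
vars <- paste0("Var", rep(1:5, each = 3), "_", rep(1:3, times = 5))

# For each name, cross-tabulate against Sex and run a Pearson chi-squared test.
# df[[v]] looks the column up by the string stored in v; df$v would not.
results <- lapply(vars, function(v) {
  tab <- table(df$Sex, df[[v]], dnn = c("Sex", v))
  list(table = tab, test = chisq.test(tab, correct = FALSE))
})
names(results) <- vars

# Print every table followed by its test result.
for (v in vars) {
  print(results[[v]]$table)
  print(results[[v]]$test)
}
```

Keeping the results in a named list means a single table or test can be pulled out later, for example results[["Var3_2"]]$test$p.value, instead of only being printed once inside the loop.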