
I am a recent convert to R and am struggling to find the R equivalent of the following: looping over variables named with a common prefix plus a number (var1, var2, ..., varn).

Say I have a dataset where each row is a store and each column is that store's revenue in month 1, month 2, ..., month 6. Some made-up data as an example:

    store = c("a", "b", "c", "d", "c")
    rev1 = c(500, 200, 600, 400, 1200)
    rev2 = c(260, 100, 450, 45, 1300)
    rev3 = c(500, 150, 610, 350, 900)
    rev4 = c(480, 200, 600, 750, 1000)
    rev5 = c(500, 68, 750, 350, 1200)
    rev6 = c(510, 80, 1000, 400, 1450)
    df = data.frame(store, rev1, rev2, rev3, rev4, rev5, rev6)

I am trying to do something like the following:

    varlist <- paste("rev", 1:6) # create list of variables rev1-rev6

    for i in varlist {
      highrev[i] <- ifelse(rev[i] > 500, 1, 0)
    }

So for each existing variable rev1:rev6, create a variable highrev1:highrev6 which equals 1 if rev1:rev6 > 500 and 0 otherwise.

Can you suggest an appropriate means of doing this?

3 Answers

5

In R, we usually don't use loops for such operations. You could simply do:

    df[paste0("highrev", 1:6)] <- (df[paste0("rev", 1:6)] > 500) + 0
    df
    #   store rev1 rev2 rev3 rev4 rev5 rev6 highrev1 highrev2 highrev3 highrev4 highrev5 highrev6
    # 1     a  500  260  500  480  500  510        0        0        0        0        0        1
    # 2     b  200  100  150  200   68   80        0        0        0        0        0        0
    # 3     c  600  450  610  600  750 1000        1        0        1        1        1        1
    # 4     d  400   45  350  750  350  400        0        0        0        1        0        0
    # 5     c 1200 1300  900 1000 1200 1450        1        1        1        1        1        1
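A note on why the `+ 0` works: comparing a data frame with a number gives a logical matrix, and adding 0 coerces TRUE/FALSE to 1/0. Here is a minimal sketch on a made-up two-column data frame (`toy`, `flags` and `flags_num` are names invented for this illustration):

```r
# Toy data: two stores, two months (made-up values)
toy <- data.frame(rev1 = c(500, 600), rev2 = c(260, 1300))

# Comparing the whole data frame at once yields a logical matrix
flags <- toy > 500

# Arithmetic on logicals coerces TRUE to 1 and FALSE to 0
flags_num <- flags + 0

# Assigning the matrix to new column names adds one column per matrix column
toy[paste0("highrev", 1:2)] <- flags_num
toy
```

Note that the comparison is strict, so a revenue of exactly 500 is flagged as 0.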

4

Setup.

    varlist <- paste0("rev", 1:6) # note that this is paste0, not paste
    hvarlist <- paste0("hi", varlist)
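To see why paste0 matters here: paste puts a space between its arguments by default, so the names it builds would not match the column names. A quick sketch (the variable names are made up for the illustration):

```r
with_space <- paste("rev", 1:2)            # default separator is a single space
no_space   <- paste0("rev", 1:2)           # no separator
also_fine  <- paste("rev", 1:2, sep = "")  # same result as paste0

with_space
no_space
```

Only the second and third forms give strings that can be used to index columns named rev1, rev2, and so on.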

data.table solution. There is a nice way to do this in data.table:

    require(data.table)
    setDT(df)[, (hvarlist) := lapply(.SD, function(x) 1L * (x > 500)), .SDcols = varlist]
    #    store rev1 rev2 rev3 rev4 rev5 rev6 hirev1 hirev2 hirev3 hirev4 hirev5 hirev6
    # 1:     a  500  260  500  480  500  510      0      0      0      0      0      1
    # 2:     b  200  100  150  200   68   80      0      0      0      0      0      0
    # 3:     c  600  450  610  600  750 1000      1      0      1      1      1      1
    # 4:     d  400   45  350  750  350  400      0      0      0      1      0      0
    # 5:     c 1200 1300  900 1000 1200 1450      1      1      1      1      1      1

The dplyr package is also designed with this sort of operation in mind, but when this answer was written it simply could not do it.
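For readers on a current dplyr: releases from 1.0.0 onward provide across(), which seems to cover this case. A minimal sketch, assuming dplyr >= 1.0.0 is installed and using a cut-down, made-up version of the data (the "hi" prefix just mirrors hvarlist above):

```r
library(dplyr)

# Cut-down, made-up data: three stores, two months
df <- data.frame(
  store = c("a", "b", "c"),
  rev1  = c(500, 200, 600),
  rev2  = c(260, 100, 1300)
)
varlist <- paste0("rev", 1:2)

# across() applies the function to each selected column;
# .names builds the new column names from the old ones
df <- df %>%
  mutate(across(all_of(varlist), ~ as.integer(.x > 500), .names = "hi{.col}"))
df
```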


A bad alternative. Here's another way, hewing closely to the OP's loop:

    within(df, {
      for (i in 1:6) assign(hvarlist[i], 1L * (get(varlist[i]) > 500))
      rm(i)
    })
    #   store rev1 rev2 rev3 rev4 rev5 rev6 hirev6 hirev5 hirev4 hirev3 hirev2 hirev1
    # 1     a  500  260  500  480  500  510      1      0      0      0      0      0
    # 2     b  200  100  150  200   68   80      0      0      0      0      0      0
    # 3     c  600  450  610  600  750 1000      1      1      1      1      0      1
    # 4     d  400   45  350  750  350  400      0      0      1      0      0      0
    # 5     c 1200 1300  900 1000 1200 1450      1      1      1      1      1      1

You can't assign to dynamic variable names with hvarlist[i] <- ...; this is done instead with assign(hvarlist[i],...), but using the latter is not a good habit. Similarly, get must be used to grab a variable on the basis of a string containing its name.
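To make the assign/get point concrete, here is a small sketch outside of within(). The names `name` and `results` are made up for the illustration; the second half shows the usual way around assign/get, which is to store values under string keys in a list (or data frame) instead of creating free-standing variables:

```r
name <- "highrev1"             # a variable name held in a string

# assign() creates a variable whose name is the string's value
assign(name, c(0, 1, 1))

# get() looks a variable up by the string holding its name
looked_up <- get(name)

# Usually preferable: index a list (or data frame) by the string
results <- list()
results[[name]] <- c(0, 1, 1)
results[[name]]
```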


1

If you want to keep the loop, you could try this:

    store = c("a", "b", "c", "d", "c")
    rev1 = c(500, 200, 600, 400, 1200)
    rev2 = c(260, 100, 450, 45, 1300)
    rev3 = c(500, 150, 610, 350, 900)
    rev4 = c(480, 200, 600, 750, 1000)
    rev5 = c(500, 68, 750, 350, 1200)
    rev6 = c(510, 80, 1000, 400, 1450)
    df = data.frame(store, rev1, rev2, rev3, rev4, rev5, rev6)

As David points out, you don't need the ifelse, since > is vectorized and works on the entire data frame:

    df[, -1] > 500
    #       rev1  rev2  rev3  rev4  rev5  rev6
    # [1,] FALSE FALSE FALSE FALSE FALSE  TRUE
    # [2,] FALSE FALSE FALSE FALSE FALSE FALSE
    # [3,]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
    # [4,] FALSE FALSE FALSE  TRUE FALSE FALSE
    # [5,]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Here is your loop, slightly amended:

    for (i in 1:6) {
      x <- paste0('rev', i)
      y <- paste0('highrev', i)
      df[, y] <- (df[, x] > 500) + 0L
    }
    #   store rev1 rev2 rev3 rev4 rev5 rev6 highrev1 highrev2 highrev3 highrev4 highrev5 highrev6
    # 1     a  500  260  500  480  500  510        0        0        0        0        0        1
    # 2     b  200  100  150  200   68   80        0        0        0        0        0        0
    # 3     c  600  450  610  600  750 1000        1        0        1        1        1        1
    # 4     d  400   45  350  750  350  400        0        0        0        1        0        0
    # 5     c 1200 1300  900 1000 1200 1450        1        1        1        1        1        1
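For comparison, the same result can be had in base R without an explicit loop by mapping over the columns with lapply. This is only a sketch on a cut-down, made-up data frame (two revenue columns rather than six; `old_cols` and `new_cols` are names invented here):

```r
# Cut-down, made-up data: three stores, two months
df <- data.frame(
  store = c("a", "b", "c"),
  rev1  = c(500, 200, 600),
  rev2  = c(260, 100, 1300)
)

old_cols <- paste0("rev", 1:2)
new_cols <- paste0("highrev", 1:2)

# lapply() returns one integer vector per selected column; assigning the
# list to df[new_cols] adds them all as new columns in one step
df[new_cols] <- lapply(df[old_cols], function(x) as.integer(x > 500))
df
```

Whether this reads better than the loop is largely a matter of taste; both index the data frame by column-name strings rather than building variables with assign.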

2 Comments

@Frank but without it, my answer is not reproducible. I hate it when someone answers a question and I cannot copy/paste to get their result.
Okay, I guess it's a matter of taste; I always run the OP's example before trying questions.
