12

I have this dataframe:

x <- data.frame( name = rep(letters[1:4], each = 2), condition = rep(c("A", "B"), times = 4), value = c(2,10,4,20,8,40,20,100) ) # name condition value # 1 a A 2 # 2 a B 10 # 3 b A 4 # 4 b B 20 # 5 c A 8 # 6 c B 40 # 7 d A 20 # 8 d B 100 

I want to group by name and divide the value of rows with condition == "B" with those with condition == "A", to get this:

data.frame( name = letters[1:4], value = c(5,5,5,5) ) # name value # 1 a 5 # 2 b 5 # 3 c 5 # 4 d 5 

I know something like this can get me pretty close:

x$value[which(x$condition == "B")]/x$value[which(x$condition == "A")] 

but I was wondering if there was an easy way to do this with dplyr (My dataframe is a toy example and I got to it by chaining multiple group_by and summarise calls).

3 Answers 3

18

Try:

x %>% group_by(name) %>% summarise(value = value[condition == "B"] / value[condition == "A"]) 

Which gives:

#Source: local data frame [4 x 2] # # name value # (fctr) (dbl) #1 a 5 #2 b 5 #3 c 5 #4 d 5 
Sign up to request clarification or add additional context in comments.

2 Comments

I have the same data as above, the only difference is that sometimes column "condition" does not have "A" or "B" all the time, so there's no denominator or numerator sometimes. I want to remove such rows and continue the division. Do you have any idea?
@vicky just filter them up front? x %>% filter(condition %in% c("A", "B"))
6

I'd use spread from tidyr.

library(dplyr) library(tidyr) x %>% spread(condition, value) %>% mutate(value = B/A) name A B value 1 a 2 10 5 2 b 4 20 5 3 c 8 40 5 4 d 20 100 5 

You could then do select(-A, -B) to drop the extra columns.

Comments

4

Using data.table, convert the 'data.frame' to 'data.table' (setDT(x)), grouped by 'name', we divide the 'value' corresponds to 'B' condition by the those that corresponds to 'A' 'condition'.

library(data.table) setDT(x)[,.(value = value[condition=="B"]/value[condition=="A"]) , name] # name value #1: a 5 #2: b 5 #3: c 5 #4: d 5 

Or reshape from 'long' to 'wide' and divide the 'B' column by 'A'.

dcast(setDT(x), name~condition, value.var='value')[, .(name, value = B/A)] 

3 Comments

@user5249203 Perhaps you meant Map or may be you want to divide by something like x[-1]/x[-length(x)]
@user5249203 your comment is not clear to me in the context of this soluiton as here, we are doing a condition check for each row, i.e. the same. Did you meant condtion == 'a', condition == 'b', condition == 'a', 'and so on. IN that case, Map is needed i..e Map(function(x, y) value[condtion == x]/value[condition == y], yourfirstvec_orcol, yoursecondvec)
will try to post a Q.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.