My data is contained in a data.frame:

    SYMBOL  variable        value     Sample    IDs   Group
    TLR8    MMRF_2613_1_BM  3.186233  Baseline  2613  LessUp
    TLR8    MMRF_2613_1_BM  5.471014  Baseline  2613  LessUp
    TLR8    MMRF_2613_1_BM  2.917965  Baseline  2613  MostUp
    TLR8    MMRF_2613_1_BM  2.147028  Baseline  2613  MostUp
    TLR4    MMRF_2613_1_BM  7.497424  Baseline  2613  LessUp
    TLR4    MMRF_2613_1_BM  4.16523   Baseline  2613  LessUp
    TLR4    MMRF_2613_1_BM  7.136523  Baseline  2613  MostUp
    TLR4    MMRF_2613_1_BM  7.96523   Baseline  2613  MostUp

For each SYMBOL, I would like to divide the sum of value for the rows where Group is "MostUp" by the sum of value for "LessUp" rows.

I believe I could use the group_by function, but I am not sure how to apply it correctly.

Here is an example of my expected output.

    SYMBOL  variable        value  Sample    IDs   Group
    TLR8    MMRF_2613_1_BM  0.58   Baseline  2613  MostUp_divided_by_LessUp
    TLR4    MMRF_2613_1_BM  1.29   Baseline  2613  MostUp_divided_by_LessUp

In addition to calculating the ratios, how would I perform a t-test between the two groups for each SYMBOL?

  • Can you explain how you got 0.58 and 1.29 as values? Did you divide the rows individually and then take the mean, or sum them and then divide? Commented Apr 9, 2020 at 8:49
  • > (2.147028+2.917965)/(5.471014+3.186233)
    [1] 0.5850582
    > (7.96523+7.136523)/(4.16523+7.497424)
    [1] 1.294881
    Commented Apr 9, 2020 at 8:52
  • Sum, then divide. Commented Apr 9, 2020 at 8:53

1 Answer

We could first calculate the sum of value for each Group within each SYMBOL, and then divide the 'MostUp' sum by the 'LessUp' sum.

    library(dplyr)

    df %>%
      group_by(SYMBOL, variable, Sample, IDs, Group) %>%
      summarise(value = sum(value)) %>%
      summarise(value = value[Group == 'MostUp']/value[Group == 'LessUp'])

    #  SYMBOL variable       Sample     IDs value
    #  <fct>  <fct>          <fct>    <int> <dbl>
    #1 TLR4   MMRF_2613_1_BM Baseline  2613 1.29
    #2 TLR8   MMRF_2613_1_BM Baseline  2613 0.585
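As a variation, the two sums can be taken inside a single summarise() call, and the Group label from the expected output can be added afterwards. This is only a sketch: it rebuilds the question's data with data.frame() so that it runs on its own, and the names ratios and the "MostUp_divided_by_LessUp" label are taken from the question or chosen for illustration.

```r
library(dplyr)

# Stand-alone copy of the question's data (same values as the dput in this answer).
df <- data.frame(
  SYMBOL   = rep(c("TLR8", "TLR4"), each = 4),
  variable = "MMRF_2613_1_BM",
  value    = c(3.186233, 5.471014, 2.917965, 2.147028,
               7.497424, 4.16523, 7.136523, 7.96523),
  Sample   = "Baseline",
  IDs      = 2613L,
  Group    = rep(c("LessUp", "LessUp", "MostUp", "MostUp"), times = 2)
)

ratios <- df %>%
  group_by(SYMBOL, variable, Sample, IDs) %>%
  # Sum each group separately, then divide MostUp by LessUp.
  summarise(value = sum(value[Group == "MostUp"]) /
                    sum(value[Group == "LessUp"])) %>%
  ungroup() %>%
  # Re-create the Group column shown in the expected output.
  mutate(Group = "MostUp_divided_by_LessUp")

ratios
```

Grouping by the identifier columns but not by Group keeps both sets of rows available inside each summarise() call, which is why one step is enough here.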

To run a t.test between the two groups for each SYMBOL, we can store the test objects in a list column:

    df1 <- df %>%
      group_by(SYMBOL, variable, Sample, IDs) %>%
      summarise(value = list(t.test(value[Group == 'MostUp'], value[Group == 'LessUp'])))

    df1
    # A tibble: 2 x 5
    # Groups:   SYMBOL, variable, Sample [2]
    #  SYMBOL variable       Sample     IDs value
    #  <fct>  <fct>          <fct>    <int> <list>
    #1 TLR4   MMRF_2613_1_BM Baseline  2613 <htest>
    #2 TLR8   MMRF_2613_1_BM Baseline  2613 <htest>
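The list column prints only as <htest>, so it can look empty. One way to get at the numbers is to pull individual components out of each stored test, as in the sketch below. The column names test, t_statistic and p_value are chosen here for illustration, and the data is rebuilt with data.frame() so the snippet is self-contained. Note that t.test() defaults to Welch's unequal-variance test, and with only two observations per group the result should be read with caution.

```r
library(dplyr)

# Stand-alone copy of the question's data.
df <- data.frame(
  SYMBOL   = rep(c("TLR8", "TLR4"), each = 4),
  variable = "MMRF_2613_1_BM",
  value    = c(3.186233, 5.471014, 2.917965, 2.147028,
               7.497424, 4.16523, 7.136523, 7.96523),
  Sample   = "Baseline",
  IDs      = 2613L,
  Group    = rep(c("LessUp", "LessUp", "MostUp", "MostUp"), times = 2)
)

results <- df %>%
  group_by(SYMBOL, variable, Sample, IDs) %>%
  # One htest object per SYMBOL, stored in a list column.
  summarise(test = list(t.test(value[Group == "MostUp"],
                               value[Group == "LessUp"]))) %>%
  ungroup() %>%
  # Extract plain numeric columns from each htest object.
  mutate(
    t_statistic = vapply(test, function(x) unname(x$statistic), numeric(1)),
    p_value     = vapply(test, function(x) x$p.value, numeric(1))
  )

results
```

A single full test can still be inspected with, for example, results$test[[1]].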

data

    df <- structure(list(
      SYMBOL = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L),
        .Label = c("TLR4", "TLR8"), class = "factor"),
      variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
        .Label = "MMRF_2613_1_BM", class = "factor"),
      value = c(3.186233, 5.471014, 2.917965, 2.147028,
        7.497424, 4.16523, 7.136523, 7.96523),
      Sample = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
        .Label = "Baseline", class = "factor"),
      IDs = c(2613L, 2613L, 2613L, 2613L, 2613L, 2613L, 2613L, 2613L),
      Group = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
        .Label = c("LessUp", "MostUp"), class = "factor")),
      class = "data.frame", row.names = c(NA, -8L))

5 Comments

Thanks. How would you calculate a t-test between the same groups?
@user2300940 you mean something like this? df %>% group_by(SYMBOL, variable, Sample, IDs) %>% summarise(value = list(t.test(value[Group == 'MostUp'], value[Group == 'LessUp'])))
Yes, however, this seems to give an empty column
What do ttest and htest mean in the output?
t.test returns an object of class 'htest'. Check ?t.test for more details; the Value section describes the components that t.test returns.
