Want to do Bootstrapping while comparing two dataframe column wise with the different number of rows.
I have two dataframe in which row represent values from experiments and column with the dataset names (data1, data2, data3, data4)
emp.data1 <- data.frame( data1 = c(234,0,34,0,46,0,0,0,2.26,0, 5,8,93,56), data2 = c(1.40,1.21,0.83,1.379,2.60,9.06,0.88,1.16,0.64,8.28, 5,8,93,56), data3 =c(0,34,43,0,0,56,0,0,0,45,5,8,93,56), data4 =c(45,0,545,34,0,35,0,35,0,534, 5,8,93,56), stringsAsFactors = FALSE ) emp.data2 <- data.frame( data1 = c(45, 0, 0, 45, 45, 53), data2 = c(23, 0, 45, 12, 90, 78), data3 = c(72, 45, 756, 78, 763, 98), data4 = c(1, 3, 65, 78, 9, 45), stringsAsFactors = FALSE ) I am trying to do bootstrapping(n=1000). Values are selected at random replacement from emp.data1(14 * 4) without change in the emp.data2(6 * 4). For example from emp.data2 first column (data1) select 6 values colSum and from emp.data1(data1) select 6 random non zero values colSum Divide the values and store in temp repeat the same 1000 times and take a median value et the end. like this i want to do it for each column of the dataframe. sample code I am providing which is working fine but i am not able get the non-zero random values for emp.data1
nboot <- 1e3 boot_temp_emp<- c() n_data1 <- nrow(emp.data1); n_data2 <- nrow(emp.data2) for (j in seq_len(nboot)) { boot <- sample(x = seq_len(n_data1), size = n_data2, replace = TRUE) value <- colSums(emp.data2)/colSums(emp.data1[boot,]) boot_temp_emp <- rbind(boot_temp_emp, value) } boot_data<- apply(boot_temp_emp, 2, median) From the above script i am able get the output but each column emp.data1[boot,] data has zero values and taken sum. I want indivisual ramdomly selected non-zero values column sum so I tried below script not able remove zero values. Not able get desired output please some one help me to correct my script
nboot <- 1e3 boot_temp_emp<- c() for (i in colnames(emp.data2)){ for (j in seq_len(nboot)){ data1=emp.data1[i] data2=emp.data2[i] n_data1 <- nrow(data1); n_data2 <- nrow(data2) boot <- sample(x = seq_len(n_data1), size = n_data2, replace = TRUE) value <- colSums(data2[i])/colSums(data1[boot, ,drop = FALSE]) boot_temp_emp <- rbind(boot_temp_emp, value) } } boot_data<- apply(boot_temp_emp, 2, median) Thank you

