Using the example by @zx8754,

```r
M <- matrix(c(1,1,0,0,1,0,1,1,1,1,1,1), 4)
```
we can define an auxiliary matrix that contains the row and column indices of the entries equal to 1:

```r
oneMat <- which(M==1, arr.ind=TRUE)
```
From this auxiliary matrix we can create a list that contains, for each row, the column numbers of the entries equal to one:

```r
oneList <- lapply(1:nrow(M), function(x) oneMat[oneMat[,1] == x, 2])
#[[1]]
#[1] 1 2 3
#
#[[2]]
#[1] 1 3
#
#[[3]]
#[1] 2 3
#
#[[4]]
#[1] 2 3
```
If the matrix `M` is large and sparse, the matrix `oneMat` should be much smaller than `M`. In that case, I think the `lapply()` loop used in the second step could lead to a speedup over the `for` loop described in the OP.
After some tests, I regretfully have to admit that this answer is particularly slow. The solution by @ColonelBeauvel is the winner:
```r
j <- list()
set.seed(123)
M <- matrix(rbinom(1e5, 1, 0.01), ncol=100)
library(microbenchmark)

f_which_and_lappy <- function(x) {oneMat <- which(x==1, arr.ind=TRUE)
                                  lapply(1:nrow(x), function(i) oneMat[oneMat[,1] == i, 2])}
f_only_apply      <- function(x) {apply(x, 1, function(i) which(i == 1))}
f_with_data.frame <- function(x) {with(data.frame(which(!!x, arr.ind=T)), split(col, row))}
f_OP              <- function(x) {for (i in 1:dim(x)[1]) {which(x[i,]==1) -> j[[i]]}}

res <- microbenchmark(f_which_and_lappy(M), f_only_apply(M),
                      f_with_data.frame(M), f_OP(M), times=1000L)
#> res
#Unit: microseconds
#                 expr       min        lq       mean     median        uq       max neval cld
# f_which_and_lappy(M) 11063.170 11254.032 12090.9506 11351.1830 11570.662  31313.48  1000   d
#      f_only_apply(M)  3204.572  3359.410  4117.4971  3456.3960  3610.945  25352.35  1000  b
# f_with_data.frame(M)   739.556   811.906   912.4726   918.0315   946.700  18623.77  1000 a
#              f_OP(M)  5642.639  5854.751  6955.9980  5969.3685  6151.209 148847.22  1000   c
```
For a sparse matrix from the Matrix package, one can use `summary(M)` and then `split($j, $i)`, or, probably more efficient, `split(rep(seq_len(ncol(M)), diff(M@p)), M@i + 1L)`.
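To make the sparse variant concrete, here is a minimal sketch, assuming the Matrix package is installed and `M` is stored as a `dgCMatrix` (whose `@p` slot holds the column pointers and `@i` the zero-based row indices); `Msp` is just an illustrative name:

```r
library(Matrix)

# sparse version of the small example matrix
Msp <- Matrix(c(1,1,0,0,1,0,1,1,1,1,1,1), nrow = 4, sparse = TRUE)

# via summary(): a data frame of (i, j, x) triplets;
# split the column indices j by the row indices i
s <- summary(Msp)
oneList1 <- split(s$j, s$i)

# directly from the dgCMatrix slots: rep() emits each column index
# once per nonzero in that column, split() then groups them by row
oneList2 <- split(rep(seq_len(ncol(Msp)), diff(Msp@p)), Msp@i + 1L)
```

Both give the same result as the dense approaches above, but only ever touch the nonzero entries.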