R: Efficient Way to partly modify diagonal of matrix

Question

I have a square matrix with dimension ranging from 100x100 to 10,000x10,000. The matrix represents parameter values for a function. I go through a loop where I try various combinations of parameters. Every iteration W has a different value so I have to update the locations in the matrix which have the value W. This happens to be the even entries of the diagonal, so [2,2], [4,4], etc. I am curious if there is a more efficient way to do this than my current approach:

W<-treeDepth*newVar for (iW in evenDiag) { matrixSource[iW,iW]<-W }

So far I've only tested with 134x134 size matrices, but looping appears to be the fastest approach according to analysis with profvis.

When I try

diag(matrixSource)[evenDiag]<-W

It appears to take similar amounts of time, but it starts calling < GC> after an average of every 5 calls or so, but the actual times < GC> is called appear to be random.

I think < GC> is garbage collection, but whatever it is, it takes forever, and it's rarely if ever called when I have the loop version above.

Am I wrong to think there's a better approach than looping one by one. Would writing the loop in Rcpp make it faster? Hadley in "Advanced R" doesn't say that writing values into a matrix would be faster with Rcpp so it probably doesn't. If it did, how would I change that one little line(s) to Rcpp without making a C++ function or anything complicated (I don't know any C++).

According to my research it isn't possible to just write something like

matrixSource[evenDiag,evenDiag]<-W

but that if it were, R would excel at vectorized processing.

What is the best approach for this?

If it's helpful, the context is that the matrix needs to be fed into

negLogLik<- -mvtnorm::dmvnorm(flattenedData,sigma=matrixSource,log=T,checkSymmetry = FALSE)

and internally to that function it's fed into chol() (which sometimes calls < GC> in parallel)

So if there's a way I can modify that function to work with only partial matrices that aren't fully created so that maybe I only have to assign the value of W once, that'd also be good.

I also need a way to assign the diagonal directly above and below the main diagonal all the same value, Z. Is there a good way to do that?

Thank you <3

matrixSource[cbind(iW,iW)]<-W should do what you want, can't speak to efficiency offhand. — MichaelChirico
– MichaelChirico, Commented Jan 14, 2024 at 8:04
you could also try regular vector indexing, e.g. matrixSource[(iW - 1)*nrow(matrixSource) + iW] <- W (might be off-by-one), or changing to sparse matrix representation of appropriate. — MichaelChirico
– MichaelChirico, Commented Jan 14, 2024 at 8:15

Rui Barradas · Accepted Answer · 2024-01-15 14:09:53Z

I have not found memory problems with square matrices up to dimension 1000.
@MichaelChirico's suggestion seems to be the most efficient. It runs in time constant with the matrix dimension, unlike diag<-.
The two methods used below are

# create an index i <- rep(c(FALSE, TRUE), ncol(A) %/% 2) j <- which(i) # can also be diag(A)[i] diag(A)[j] <- W # cannot be cbind(i, i) A[cbind(j, j)] <- 0

cbind is clearly more efficient.

library(microbenchmark) library(ggplot2) testFun <- function(N) { out <- lapply(seq.int(N)[-1L], \(n) { A <- matrix(1:n^2, n, n) W <- rep(0L, ncol(A) %/% 2L) i <- rep(c(FALSE, TRUE), ncol(A) %/% 2) j <- which(i) mb <- microbenchmark( diag = {diag(A)[i] <- W}, cbind = {A[cbind(j, j)] <- W} ) mb$dim <- n mb }) out <- do.call(rbind, out) aggregate(time ~ ., out, median) } testData <- testFun(1000) ggplot(testData, aes(dim, time, colour = expr)) + geom_line() + geom_point() + theme_bw()

^{Created on 2024-01-14 with reprex v2.0.2}

Edit

Another way of assigning the diagonal elements of a square matrix is proposed by user20650 in comment. Change the timimg instruction call above to

mb <- microbenchmark( diag = {diag(A)[i] <- W}, cbind = {A[cbind(j, j)] <- W}, sq = {A[j*j] <- W} )

and the plot (without diag) becomes

wow. this is amazing. thanks! I only run cbind once since the entries that take the value W are always the same. But it does appear that running cbind once and then executing matrixSource[evenDiagBinded]<-W each iteration is a little faster than running for (iW in evenDiag) { matrixSource[iW,iW]<-W } each iteration after compilation (not sure about uncompiled). Probably the matrixSource[evenDiagBinded]<-W syntax gives the most efficient code for changing the matrix values?
@AFriendlyFish If you run cbind to see its code you will see .Internal(cbind(deparse.level, ...)). cbind is written in optimized C and therefore I was expecting it to be efficient. Even difficult to improve upon. But the indexing has code to handle non-square matrices so it might be further optimized, I guess.
as updating the diagonal of a square matrix could use sq = {A[j*j] <- W} which I'd think would be faster (only tested on small matrices)
@user20650 You are right, I have tested it on matrices with dim = 2:1000 and it's faster.
Thanks @RuiBarradas. Interesting that something seems to happen around 600 dimension.

Collectives™ on Stack Overflow

R: Efficient Way to partly modify diagonal of matrix

1 Answer 1

Edit

5 Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Edit

5 Comments

Related