In R, I'm trying to work with a large matrix (39,146,166 rows by 127 columns) and I'm running into memory problems with a number of operations on it. I've determined that about 35% of the entries in the matrix are non-zero and the rest are zeros. Is this sparse enough that I would save memory by representing the matrix with one of R's sparse matrix classes? What is a good rule of thumb for deciding when a matrix is worth representing sparsely?
1 Answer
I don't think a sparse representation will be that much more compact here. In a triplet layout you need three numbers for each entry other than an implicit zero: a row index, a column index, and the value itself. Even if the two indices are 4-byte integers, each stored nonzero costs 16 bytes, versus the 8 bytes per entry of a "serial" dense storage strategy.
By this reasoning, anything above 50% density will take more space as triplets than as a dense matrix, and at your 35% you would save only about 30%. But I'm posting from an iPhone under SF Bay, so I can't test with `object.size`.
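Here is a back-of-the-envelope version of that estimate (a sketch, assuming the 4-byte-index, 8-byte-double triplet layout described above; R's actual per-object overhead will differ a little):

```r
# Rough size comparison for the matrix in the question:
# dense doubles versus a (row, col, value) triplet layout.
n_rows  <- 39146166
n_cols  <- 127
density <- 0.35
n_entries <- n_rows * n_cols

dense_bytes   <- n_entries * 8                     # one 8-byte double per cell
triplet_bytes <- n_entries * density * (4 + 4 + 8) # two ints + one double per nonzero

dense_bytes   / 2^30  # ~37 GiB
triplet_bytes / 2^30  # ~26 GiB, roughly 70% of dense
```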
4 Comments
Ryan C. Thompson
There are a number of sparse matrix formats, and not all of them require three numbers per nonzero entry. For example, this format requires only about two per entry in my case: netlib.org/linalg/html_templates/node92.html
IRTFM
@RyanThompson: That format requires three vectors, not two.
Ryan C. Thompson
Only two of those vectors have an entry for each nonzero data point. The third has just one element per column, which is negligible in my case.
IRTFM
If that were one of the representations in the R Matrix package, you could use it as a basis for estimation. But as far as I can tell, neither the triplet ("T") nor the compressed-column ("C") classes use such a method.
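For readers who want to check this concretely: the sketch below assumes the Matrix package's compressed-column class (dgCMatrix) with its @x/@i/@p slots, and uses rsparsematrix() to build a smaller matrix with the question's column count and density so that object.size() can be compared directly, as the answer suggested:

```r
# Inspect what the Matrix package actually stores for a
# compressed-column sparse matrix with the question's column
# count and density (fewer rows, so it fits in memory).
library(Matrix)

m <- rsparsematrix(10000, 127, density = 0.35)  # a dgCMatrix

length(m@x)  # values: one double per stored nonzero
length(m@i)  # row indices: one integer per stored nonzero
length(m@p)  # column pointers: ncol + 1 = 128, negligible

object.size(m)            # sparse representation
object.size(as.matrix(m)) # dense equivalent
```

On that layout each nonzero costs about 12 bytes (an 8-byte double plus a 4-byte index), so the break-even point is roughly 2/3 density rather than 1/2; at 35% the sparse matrix should use a bit over half the memory of the dense one.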