Performance of vectorizing code to create a sparse matrix with a single 1 per row from a vector of indexes

Question

I have a large column vector y containing integer values from 1 to 10. I wanted to convert it to a matrix where each row is full of 0s except for a 1 at the index given by the value at the respective row of y.

This example should make it clearer:

y = [3; 4; 1; 10; 9; 9; 4; 2; ...] % gets converted to: Y = [ 0 0 1 0 0 0 0 0 0 0; 0 0 0 1 0 0 0 0 0 0; 1 0 0 0 0 0 0 0 0 0; 0 0 0 0 0 0 0 0 0 1; 0 0 0 0 0 0 0 0 1 0; 0 0 0 0 0 0 0 0 1 0; 0 0 0 1 0 0 0 0 0 0; 0 1 0 0 0 0 0 0 0 0; ... ]

I have written the following code for this (it works):

m = length(y); Y = zeros(m, 10); for i = 1:m Y(i, y(i)) = 1; end

I know there are ways I could remove the for loop in this code (vectorizing). This post contains a few, including something like:

Y = full(sparse(1:length(y), y, ones(length(y),1)));

But I had to convert y to doubles to be able to use this, and the result is actually about 3x slower than my "for" approach, using 10.000.000 as the length of y.

Is it likely that doing this kind of vectorization will lead to better performance for a very large y? I've read many times that vectorizing calculations leads to better performance (not only in MATLAB), but this kind of solution seems to result in more calculations.
Is there a way to actually improve performance over the for approach in this example? Maybe the problem here is simply that acting on doubles instead of ints isn't the best thing for comparison, but I couldn't find a way to use sparse otherwise.

maybe if you change the array names from y and Y to some thing different like x and y, that could help a little. once I was using ECG as a name and it got my code working so slow, until I realized that ecg is a MATLAB function. — Rash
– Rash, Commented Nov 2, 2014 at 20:01
That is good advice ;) Maybe it's a bit confusing for readers too. Now I can't change it or all the questions would need modifications, but I'll remember that next time. — CanisLupus
– CanisLupus, Commented Nov 2, 2014 at 21:05

Amro · Accepted Answer · 2014-11-02 20:09:57Z

Here is a test to comapre:

function [t,v] = testIndicatorMatrix() y = randi([1 10], [1e6 1], 'double'); funcs = { @() func1(y); @() func2(y); @() func3(y); @() func4(y); }; t = cellfun(@timeit, funcs, 'Uniform',true); v = cellfun(@feval, funcs, 'Uniform',false); assert(isequal(v{:})) end

function Y = func1(y) m = numel(y); Y = zeros(m, 10); for i = 1:m Y(i, y(i)) = 1; end end function Y = func2(y) m = numel(y); Y = full(sparse(1:m, y, 1, m, 10, m)); end function Y = func3(y) m = numel(y); Y = zeros(m,10); Y(sub2ind([m,10], (1:m).', y)) = 1; end function Y = func4(y) m = numel(y); Y = zeros(m,10); Y((y-1).*m + (1:m).') = 1; end

I get:

>> testIndicatorMatrix ans = 0.0388 0.1712 0.0490 0.0430

Such a simple for-loop can be dynamically JIT-compiled at runtime, and would run really fast (even slightly faster than vectorized code)!

Good benchmarking! +1. Could you use Y(m,10) = 0; instead for last two funcs, as that must speed it up a bit.
I think the OP is optimizing prematurely; I doubt this task of creating the indicator matrix is the bottleneck in the code! All of them run in a fraction of a second for a vector with millions of elements :)
Yeah, but I guess for the context of this problem and see how the approaches mentioned thus far fair, that would be fair I guess. Damn, too many "fair" used there :)
right. I guess we should mention this question: stackoverflow.com/questions/14169222/…
Actually, the original source was this one I think - undocumentedmatlab.com/blog/preallocation-performance I mean I found out about this first there.

Community · Accepted Answer · 2017-05-23 12:34:54Z

It seems you are looking for that full numeric matrix Y as the output. So, you can try this approach -

m = numel(y); Y1(m,10) = 0; %// Faster way to pre-allocate zeros than using function call `zeros` %// Source - http://undocumentedmatlab.com/blog/preallocation-performance linear_idx = (y-1)*m+(1:m)'; %//'# since y is mentioned as a column vector, %// so directly y can be used instead of y(:) Y1(linear_idx)=1; %// Y1 would be the desired output

Benchmarking

Using Amro's benchmark post and increasing the datasize a bit -

y = randi([1 10], [1.5e6 1], 'double');

And finally doing the faster pre-allocation scheme mentioned earlier of using Y(m,10)=0; instead of Y = zeros(m,10);, I got these results on my system -

>> testIndicatorMatrix ans = 0.1798 0.4651 0.1693 0.1457

That is the vectorized approach mentioned here (the last one in the benchmark suite) is giving you more than 15% performance improvement over your for-loop code (the first one in the benchmark suite). So, if you are using large datasizes and intend to get full versions of sparse matrices, this approach would make sense (in my personal opinion).

AnonSubmitter85 · Accepted Answer · 2014-11-02 19:56:26Z

Does something like this not work for you?

tic; N = 1e6; y = randperm( N ); Y = spalloc( N, N, N ); inds = sub2ind( size(Y), y(:), (1:N)' ); Y = sparse( 1:N, y, 1, N, N, N ); toc

The above outputs

Elapsed time is 0.144683 seconds.

Collectives™ on Stack Overflow

Performance of vectorizing code to create a sparse matrix with a single 1 per row from a vector of indexes

3 Answers 3

14 Comments

Benchmarking

Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

14 Comments

Benchmarking

Comments

Comments

Linked

Related