Sum over squared array

Question

As part of a batch Euclidean distance computation, I'm computing

(X * X).sum(axis=1)

where X is a rather large 2-d array. This works fine, but it constructs a temporary array of the same size as X. Is there any way to get rid of this temporary, but retain the efficiency of a vectorized operation?

The obvious candidate,

np.array([np.dot(row, row) for row in X])

works, but uses a Python list as a temporary, making it rather slow.

Without the axis, the memory-efficient form would be

(X * X).sum() => np.dot(X.ravel(), X.ravel())

and I know that, when axis=1, it's equivalent to

np.diag(np.dot(X, X.T))

which got me looking into generalizations of dot such as np.inner, np.tensordot and np.einsum, but I can't figure out how they would solve my problem.

You should check the following: stackoverflow.com/questions/17527340/… For large arrays, the memory layout becomes very important apparently — usethedeathstar
– usethedeathstar, Commented Sep 30, 2013 at 13:27
@usethedeathstar: I'm aware of that; Euclidean distance is a dot product plus the sum of two of the above operations, and the layout is already optimized for the dot product. — Fred Foo
– Fred Foo, Commented Sep 30, 2013 at 13:32

YXD · Accepted Answer · 2013-09-30 12:53:33Z

11

The einsum equivalent is:

np.einsum('ij,ij->i', X, X)

Though I am not sure how this works internally so it may or may not not solve your problem.

answered Sep 30, 2013 at 12:53

YXD

32.6k15 gold badges79 silver badges117 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Fred Foo Over a year ago

A first test shows that it's extremely fast. I'll do some memory profiling.

Daniel Over a year ago

Einsum is probably one of the best functions numpy has for handling large arrays. It is compiled entirely in C and takes advantage of SSE2 which numpy's ufuncs will not use until 1.8. Also it does try to avoid holding large numpy intermediates in memory. Source code is here.

Jaime Over a year ago

The other alternative is inner1d: numpy.core.umath_tests import inner1d; inner1d(X, X), but np.einsum is faster most of the time, or on most systems.

Fred Foo Over a year ago

@Jaime: is that even supposed to be a public function?

Jaime Over a year ago

I think it is still experimental, but it is here to stay. inner1d and matrix_multiply are the two first instances of gufuncs, where the 'g' is for 'generalized', here's what the API docs say. In numpy 1.8 there will be a whole batch of linear algebra gufuncs, including those two, browse the PR to see what will be added. Best thing of gufuncs is they make it really, really easy to extend numpy with C when you are dealing with more complicated algorithms.

|

Collectives™ on Stack Overflow

Sum over squared array

1 Answer 1

6 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

6 Comments

Linked

Related