
Is there a speedy way to do this:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 2])
c = np.array([a, b])
result = magic(c)
```

where magic() is the functionality I want, and result should be np.array([10, 3]), i.e. a numpy array containing the sum of each of the input arrays.

  • Why not sums = [sum(arr) for arr in [a, b]]? Commented Sep 18, 2015 at 23:07
  • Are a, b, etc. longer or shorter than c, typically? Is c already created, or can we bypass that step? Commented Sep 18, 2015 at 23:29
  • Well, I was hoping for a nice numpy implementation that avoids loops. @Oliver W. tells me that operations on unequal-length arrays are not efficient in numpy, so maybe the answer is to think of a different way to store my data... Commented Sep 18, 2015 at 23:32
  • @askewchan a and b can vary between length 1 and a few hundred. c typically has a length of a few tens, and c is already created. Commented Sep 18, 2015 at 23:41
  • np.add.reduceat(d, ind) is a fast way of summing uneven blocks of the array d. But constructing d (np.hstack(c)) and ind takes time. Commented Sep 19, 2015 at 0:11
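The reduceat idea from the last comment can be sketched on the question's own data; d and ind are the names used in that comment (flattened values and block start offsets):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 2])

# Flatten the ragged collection, then sum each block at its start offset.
d = np.hstack([a, b])        # [1, 2, 3, 4, 1, 2]
ind = np.array([0, len(a)])  # each block starts where the previous one ends
result = np.add.reduceat(d, ind)
print(result)                # [10  3]
```

np.add.reduceat sums d[ind[k]:ind[k+1]] for each k (and through the end of d for the last offset), which is exactly the per-sub-array sum the question asks for.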

2 Answers


Here's a compilation of the suggestions (answers and comments) and their timings:

```python
import numpy as np

# dtype=object is required for ragged arrays in recent NumPy versions
c = np.array([np.random.rand(np.random.randint(1, 300)) for i in range(50)],
             dtype=object)

def oliver(arr):
    res = np.empty_like(arr)
    for enu, subarr in enumerate(arr):
        res[enu] = np.sum(subarr)
    return res

def reut(arr):
    return np.array([a.sum() for a in arr])

def hpaulj(arr):
    d = np.concatenate(arr)
    l = np.array([len(a) for a in arr])  # block lengths (a list, not a map
                                         # object, so it works on Python 3)
    i = np.cumsum(l) - l                 # block start offsets
    return np.add.reduceat(d, i)
```

And their times:

```
In [94]: timeit oliver(c)
1000 loops, best of 3: 457 µs per loop

In [95]: timeit reut(c)
1000 loops, best of 3: 317 µs per loop

In [96]: timeit hpaulj(c)
10000 loops, best of 3: 94.4 µs per loop
```

It was somewhat tricky to implement @hpaulj's suggestion, but I think I got it (and it is the fastest of the three if you use concatenate instead of hstack).
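As a self-contained sanity check (not part of the original answer), the reduceat-based approach reproduces the expected result on the small example from the question:

```python
import numpy as np

# The question's data, stored as an object array of unequal-length arrays.
c = np.array([np.array([1, 2, 3, 4]), np.array([1, 2])], dtype=object)

d = np.concatenate(c)              # flattened values: [1, 2, 3, 4, 1, 2]
l = np.array([len(a) for a in c])  # block lengths:    [4, 2]
i = np.cumsum(l) - l               # start offsets:    [0, 4]
print(np.add.reduceat(d, i))       # [10  3]
```

The cumsum-minus-lengths trick converts block lengths into the start index of each block, which is the index array reduceat needs.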



Of the many possible solutions, this is one:

```python
import numpy as np

def your_magic(arr):
    res = np.empty_like(arr)
    for enu, subarr in enumerate(arr):
        res[enu] = np.sum(subarr)
    return res
```

Be mindful, though, that building a numpy array out of unequal-length arrays is not efficient at all; it is essentially equivalent to storing the arrays in a plain Python list. This is why the returned array res in the function above will in general have dtype object.
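A short illustration of the point above (the exact native integer dtype in the comparison case is platform-dependent, e.g. int64 on most systems):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 2])

# Unequal-length rows cannot form a contiguous 2-D block, so NumPy stores
# references to Python objects instead (dtype=object is required explicitly
# in recent NumPy versions).
c = np.array([a, b], dtype=object)
print(c.dtype)                            # object

# Equal-length rows, by contrast, give a contiguous numeric array.
print(np.array([[1, 2], [3, 4]]).dtype)   # a native integer dtype
```

Operations on the object array fall back to per-element Python calls, which is why the loop-based solutions above cannot beat a hand-rolled list comprehension by much.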

Comments

  • Thanks! When you say "not efficient", are we talking memory, speed, or both?
  • @fen Speed: numpy was designed for computing on contiguous blocks of memory. By forcing it into a list of objects, you're adding unnecessary housekeeping under the hood. Have a look at this thread, in particular the first comment and first post. @Reut's comment on your post is also an elegant way to define your magic function (although a case could be made for using the numpy method sum rather than the Python builtin).
