Vectorizing operation on numpy array

Question

I have a numpy array containing many three-dimensional numpy arrays, where each of these sub-elements is a grayscale image. I want to use numpy's vectorize to apply an affine transformation to each image in the array.

Here is a minimal example that reproduces the issue:

    import cv2
    import numpy as np
    from functools import partial

    # create four blank images
    data = np.zeros((4, 1, 96, 96), dtype=np.uint8)
    M = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float32)  # dummy affine transformation matrix
    size = (96, 96)  # output image size

Now I want to pass each of the images in data to cv2.warpAffine(src, M, dsize). Before I vectorize it, I first create a partial function that binds M and dsize:

    warpAffine = lambda M, size, img : cv2.warpAffine(img, M, size)  # re-order function parameters
    partialWarpAffine = partial(warpAffine, M, size)
    vectorizedWarpAffine = np.vectorize(partialWarpAffine)
    print(data[:, 0].shape)  # prints (4, 96, 96)
    vectorizedWarpAffine(data[:, 0])

But this outputs:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1573, in __call__
        return self._vectorize_call(func=func, args=vargs)
      File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1633, in _vectorize_call
        ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
      File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1597, in _get_ufunc_and_otypes
        outputs = func(*inputs)
      File "<stdin>", line 1, in <lambda>
    TypeError: src is not a numpy array, neither a scalar

What am I doing wrong - why can't I vectorize an operation on numpy arrays?

vectorize facilitates broadcasting. If you don't need that, just hide your loop in a function.
– hpaulj, Commented Feb 15, 2015 at 4:46

Community · Accepted Answer · 2017-05-23 11:59:30Z

The problem is that using partial does not make the other arguments disappear as far as vectorize is concerned. The partial object is stored as vectorizedWarpAffine.pyfunc; it keeps track of the pre-bound arguments and supplies them when it calls the underlying function, vectorizedWarpAffine.pyfunc.func (which is still a multi-argument function).

You can see it like this (after you import inspect):

    In [19]: inspect.getargspec(vectorizedWarpAffine.pyfunc.func)
    Out[19]: ArgSpec(args=['M', 'size', 'img'], varargs=None, keywords=None, defaults=None)
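The relationship between a partial object and the function it wraps can be checked without NumPy or OpenCV. The following is a minimal sketch, assuming Python 3 (where inspect.signature takes the place of the older getargspec); warp is a hypothetical stand-in for the lambda in the question, not the real cv2 call.

```python
from functools import partial
import inspect

def warp(M, size, img):
    """Hypothetical stand-in for the question's lambda; it just echoes its arguments."""
    return (M, size, img)

bound = partial(warp, "M-placeholder", "size-placeholder")

# The partial object remembers the wrapped function and the pre-bound arguments.
wrapped_params = list(inspect.signature(bound.func).parameters)
prebound_args = bound.args

# Seen from the outside, only the unbound parameter is left to supply.
remaining_params = list(inspect.signature(bound).parameters)

print(wrapped_params)    # parameters of the underlying function
print(prebound_args)     # values stored by partial
print(remaining_params)  # parameters still open on the partial object
```

The underlying function keeps all three parameters, while the partial object itself only leaves img open.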

To get around this, you can use the excluded option of np.vectorize, which specifies which arguments (positional or keyword) should be left out of the vectorization:

vectorizedWarpAffine = np.vectorize(partialWarpAffine, excluded=set((0, 1)))
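To make the effect of excluded concrete, here is a small sketch that does not depend on OpenCV. The polyval helper is hypothetical and exists only for illustration: its first argument is a fixed list of coefficients that should be passed through untouched, while the second is the value to loop over.

```python
import numpy as np

def polyval(coeffs, x):
    """Evaluate a polynomial at a single scalar x using Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

# Position 0 (coeffs) is excluded, so it is handed to polyval unchanged,
# while position 1 (x) is looped over element by element.
vpolyval = np.vectorize(polyval, excluded={0})

values = vpolyval([1, 2, 3], [0, 1])
print(values)
```

Note that the excluded positions refer to the arguments of the call as np.vectorize sees them, so they shift if some arguments have already been bound elsewhere.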

When I make this change, the code appears to actually execute the vectorized function, but it then hits an error inside OpenCV's imgwarp.cpp, presumably due to some bad data assumption on this test data:

    In [21]: vectorizedWarpAffine(data[:, 0])
    OpenCV Error: Assertion failed (cn <= 4 && ssize.area() > 0) in remapBilinear, file -------src-dir-------/opencv-2.4.6.1/modules/imgproc/src/imgwarp.cpp, line 2296
    ---------------------------------------------------------------------------
    error                                     Traceback (most recent call last)
    <ipython-input-21-3fb586393b75> in <module>()
    ----> 1 vectorizedWarpAffine(data[:, 0])

    /home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in __call__(self, *args, **kwargs)
       1570             vargs.extend([kwargs[_n] for _n in names])
       1571
    -> 1572         return self._vectorize_call(func=func, args=vargs)
       1573
       1574     def _get_ufunc_and_otypes(self, func, args):

    /home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in _vectorize_call(self, func, args)
       1628         """Vectorized call to `func` over positional `args`."""
       1629         if not args:
    -> 1630             _res = func()
       1631         else:
       1632             ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)

    /home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in func(*vargs)
       1565                     the_args[_i] = vargs[_n]
       1566                 kwargs.update(zip(names, vargs[len(inds):]))
    -> 1567                 return self.pyfunc(*the_args, **kwargs)
       1568
       1569             vargs = [args[_i] for _i in inds]

    /home/ely/programming/np_vect.py in <lambda>(M, size, img)
         10 size = (96, 96) # output image size
         11
    ---> 12 warpAffine = lambda M, size, img : cv2.warpAffine(img, M, size) # re-order function parameters
         13 partialWarpAffine = partial(warpAffine, M, size)
         14

    error: -------src-dir-------/opencv-2.4.6.1/modules/imgproc/src/imgwarp.cpp:2296: error: (-215) cn <= 4 && ssize.area() > 0 in function remapBilinear
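One possible reading of this traceback: it passes through the `if not args: _res = func()` branch, which suggests that once M and size are pre-bound, the image array itself sits at position 0, so excluding positions 0 and 1 leaves np.vectorize with nothing to loop over. The sketch below illustrates that behaviour with NumPy only. record_shape is a hypothetical probe, and whether this is what triggers the OpenCV assertion is an assumption rather than something confirmed in the thread.

```python
import numpy as np

seen_shapes = []

def record_shape(img):
    """Hypothetical probe: remember the shape of whatever vectorize passes in."""
    seen_shapes.append(np.shape(img))
    return 0

data = np.zeros((4, 96, 96), dtype=np.uint8)

# The only positional argument is excluded, so nothing is left to iterate over
# and the function receives the whole stack of images in a single call.
probe = np.vectorize(record_shape, excluded={0, 1})
probe(data)

print(seen_shapes)
```

If the same thing happens with cv2.warpAffine, it would be handed the full (4, 96, 96) stack rather than one 96 x 96 image at a time.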

As a side note: I am seeing a shape of (4, 96, 96) for your data, not (4, 10, 10).

Also note that using np.vectorize is not a technique for improving the performance of a function. All it does is gently wrap your function call inside a superficial for-loop (albeit at the NumPy level). It is a technique for writing functions that automatically adhere to NumPy broadcasting rules and for making your API superficially similar to NumPy's API, whereby function calls are expected to work correctly on top of ndarray arguments.
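Both points can be seen in a short sketch: a scalar-only function gains broadcasting, yet it is still called once per element from Python. The positive_gap function and the calls list are hypothetical names used only for this illustration.

```python
import numpy as np

calls = []

def positive_gap(a, b):
    """Scalar-only function: a - b when a exceeds b, otherwise 0."""
    calls.append((a, b))
    return a - b if a > b else 0

# otypes is given so that vectorize does not make an extra probing call.
vgap = np.vectorize(positive_gap, otypes=[int])

column = np.array([1, 2, 3])[:, None]   # shape (3, 1)
row = np.array([0, 2])                  # shape (2,)
result = vgap(column, row)              # inputs are broadcast against each other

print(result.shape)
print(len(calls))  # one Python-level call per broadcast element
```

The (3, 1) and (2,) inputs combine into a (3, 2) result, at the cost of one Python function call for each of its entries.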

See this post for more details.

Added: The main reason you are using partial in this case is to get a new function that is ostensibly single-argument, but that doesn't work out as planned because of the way partial works. So why not get rid of partial altogether?

You can leave your lambda function exactly as it is, even with the two non-array positional arguments, but still ensure that the third argument is treated as something to vectorize over. To do this, you just use excluded as above, but you also need to tell vectorize what to expect as the output.

The reason for this is that vectorize will try to determine what the output type is supposed to be by running your function on the first element of the data you supply. In this case (I am not fully sure, and it would be worth more debugging) this seems to create the "src is not a numpy array" error you were seeing.
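The probing call can be observed by counting how often the wrapped function runs. This is a minimal sketch, assuming the default cache=False setting of np.vectorize; make_counter and square are hypothetical helpers for the demonstration.

```python
import numpy as np

def make_counter():
    """Return a squaring function together with the list that logs its inputs."""
    calls = []
    def square(x):
        calls.append(x)
        return x * x
    return square, calls

values = np.arange(6).reshape(2, 3)

probed, probed_calls = make_counter()
np.vectorize(probed)(values)                   # output type inferred from a trial call

declared, declared_calls = make_counter()
np.vectorize(declared, otypes=[int])(values)   # output type declared up front

print(len(probed_calls), len(declared_calls))
```

When the output type is inferred, the first element is evaluated one extra time; declaring otypes skips that trial call.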

So to prevent vectorize from even trying it, you can provide a list of the output types yourself, like this:

vectorizedWarpAffine = np.vectorize(warpAffine, excluded=(0, 1), otypes=[np.ndarray])

and it works:

    In [29]: vectorizedWarpAffine(M, size, data[:, 0])
    Out[29]:
    array([[[ array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
           [ 0.,  0.,  0., ...,  0.,  0.,  0.],
           [ 0.,  0.,  0., ...,  0.,  0.,  0.],
           ...,
    ...

I think this is a lot nicer because when you call vectorizedWarpAffine you still pass the other positional arguments explicitly, instead of going through the layer of indirection where they are pre-bound with partial, and yet the third argument is still vectorized over.
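If the goal is to call the function once per image rather than once per element, one alternative that is not used in this answer is the signature argument of np.vectorize, available in NumPy 1.12 and later. The sketch below is offered under stated assumptions: translate is a hypothetical stand-in for cv2.warpAffine that only handles non-negative integer translations and an output size equal to the input size, so that the example runs without OpenCV.

```python
import numpy as np

def translate(M, size, img):
    """Hypothetical stand-in for cv2.warpAffine(img, M, size).

    Only handles non-negative integer translations taken from M[:, 2],
    and assumes the output size equals the input image size.
    """
    tx, ty = int(M[0, 2]), int(M[1, 2])
    width, height = size
    out = np.zeros((height, width), dtype=img.dtype)
    out[ty:, tx:] = img[:height - ty, :width - tx]
    return out

# Four small 3 x 5 "images" with distinct pixel values.
data = np.arange(4 * 3 * 5, dtype=np.uint8).reshape(4, 3, 5)
M = np.array([[1, 0, 1], [0, 1, 0]], dtype=np.float32)  # shift one pixel to the right
size = (5, 3)                                           # (width, height)

# The signature declares that the vectorized argument is a 2-D block, so the
# loop runs over the leading axis and each call receives one whole image.
per_image = np.vectorize(translate, excluded={0, 1}, signature="(h,w)->(p,q)")
warped = per_image(M, size, data)

print(warped.shape)
```

With a signature, the result is a regular numeric array with one transformed image per entry along the first axis, rather than an object array.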

Thanks! The (4, 10, 10) was just my own stupidity; I've fixed it now. After applying your change regarding excluded=set((0, 1)) it works for me. Strangely enough, I also see an OpenCV error, even though partialWarpAffine now works fine when I pass it individual elements of the data array.
I used vectorize because I thought the code would be cleaner; I'm aware that it brings no performance benefit. However, now I'm wondering whether it might be easier to use a standard loop. Unless you can suggest an alternative for applying a transformation to many images?
A standard loop might be fine; it will almost surely be more readable. Also, I'm going to add some edits at the bottom in a second to show how you can do this directly with np.vectorize without needing partial at all.
Thanks for the helpful explanations! I agree that it's better to avoid partial in this case. I'm still trying to figure out why OpenCV is giving an error, but I'm starting to think a standard loop might be the way to go here.
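For reference, the standard-loop approach discussed in these comments could look like the sketch below. apply_to_each is a hypothetical helper name, and np.flipud is only a placeholder transformation so that the example runs without OpenCV.

```python
import numpy as np

def apply_to_each(images, func):
    """Apply func to every image along the first axis and stack the results."""
    return np.stack([func(img) for img in images])

data = np.zeros((4, 1, 96, 96), dtype=np.uint8)

# np.flipud is used here only as a placeholder transformation. With OpenCV
# available, the question's call could be passed instead, e.g.
#   lambda img: cv2.warpAffine(img, M, size)
flipped = apply_to_each(data[:, 0], np.flipud)
print(flipped.shape)  # (4, 96, 96)
```

This keeps the loop hidden inside one function, as suggested in the first comment, and each call receives a complete 2-D image.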
