find and delete from more-dimensional numpy array

Question

I have two NumPy arrays:

p_a_colors = np.array([[0,0,0], [0,2,0], [119,103,82], [122,122,122], [122,122,122], [3,2,4]])
p_rem = np.array([[119,103,82], [122,122,122]])

I want to delete all the rows from p_a_colors that are in p_rem, so I get:

p_r_colors=np.array([[0,0,0], [0,2,0], [3,2,4]])

I think something like this should work:

p_r_colors= np.delete(p_a_colors, np.where(np.all(p_a_colors==p_rem, axis=0)),0)

but I just don't get the axis or [:] right.

I know that

import copy

p_r_colors = copy.deepcopy(p_a_colors)
for i in range(len(p_rem)):
    p_r_colors = np.delete(p_r_colors, np.where(np.all(p_r_colors == p_rem[i], axis=-1)), 0)

would work, but I am trying to avoid (Python) loops, because I also want good performance.

It should give me a new NumPy array p_r_colors, which is p_a_colors minus the rows of p_rem, with the same layout as the two other arrays.
– a.j. tawleed, Commented May 30, 2013 at 15:09

Jaime · Accepted Answer · 2013-05-30 15:39:30Z

This is how I would do it:

dtype = np.dtype((np.void, (p_a_colors.shape[1] * p_a_colors.dtype.itemsize)))
mask = np.in1d(p_a_colors.view(dtype), p_rem.view(dtype))
p_r_colors = p_a_colors[~mask]

>>> p_r_colors
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])

You need to do the void dtype thing so that numpy compares rows as a whole. After that using the built-in set routines seems like the obvious way to go.
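To make the void dtype step concrete, here is a minimal, self-contained sketch of the same idea. The helper name `rows_as_void` is made up for illustration, and `np.isin` is swapped in for `np.in1d` on the assumption that a reasonably recent NumPy is installed; the contiguity and dtype handling are extra precautions, not part of the answer above.

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0], [0, 2, 0], [119, 103, 82],
                       [122, 122, 122], [122, 122, 122], [3, 2, 4]])
p_rem = np.array([[119, 103, 82], [122, 122, 122]])


def rows_as_void(a):
    """View each row of a 2-D array as one opaque item (hypothetical helper)."""
    a = np.ascontiguousarray(a)  # the byte-level view needs contiguous rows
    row_dtype = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(row_dtype).ravel()  # shape (n_rows,), one item per row


# Cast p_rem to the dtype of p_a_colors so that equal rows have equal bytes.
rem_rows = rows_as_void(p_rem.astype(p_a_colors.dtype))
mask = np.isin(rows_as_void(p_a_colors), rem_rows)

p_r_colors = p_a_colors[~mask]
print(p_r_colors)
```

Because each row becomes a single item, the set routine compares whole rows rather than individual numbers.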

YXD · Accepted Answer · 2013-05-30 15:34:51Z

It's ugly, but

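from functools import reduce  # reduce is not a builtin on Python 3

# OR together one row-match mask per row of p_rem (plain bool instead of np.bool)
tmp = reduce(lambda x, y: x | np.all(p_a_colors == y, axis=-1), p_rem, np.zeros(p_a_colors.shape[:1], dtype=bool))
indices = np.where(tmp)[0]
np.delete(p_a_colors, indices, axis=0)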

(edit: corrected)

>>> from functools import reduce
>>> tmp = reduce(lambda x, y: x | np.all(p_a_colors == y, axis=-1), p_rem, np.zeros(p_a_colors.shape[:1], dtype=bool))
>>> indices = np.where(tmp)[0]
>>> np.delete(p_a_colors, indices, axis=0)
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])
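The same OR-of-row-matches can also be written without `reduce` by letting broadcasting do the pairing. This is only a sketch of an alternative, not the method from this answer, and it builds an intermediate boolean array of shape (len(p_a_colors), len(p_rem), 3), which may be too large if both arrays are big.

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0], [0, 2, 0], [119, 103, 82],
                       [122, 122, 122], [122, 122, 122], [3, 2, 4]])
p_rem = np.array([[119, 103, 82], [122, 122, 122]])

# Shape (6, 1, 3) against (1, 2, 3) broadcasts to (6, 2, 3):
# matches[i, j, k] tells whether p_a_colors[i, k] == p_rem[j, k].
matches = p_a_colors[:, None, :] == p_rem[None, :, :]

# A row is removed if all of its entries match at least one row of p_rem.
mask = matches.all(axis=-1).any(axis=-1)

p_r_colors = p_a_colors[~mask]
print(p_r_colors)
```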

tiago · Accepted Answer · 2013-05-30 15:54:32Z

You are getting the indices wrong. The expression p_a_colors==p_rem does not compare rows: the shapes (6, 3) and (2, 3) cannot be broadcast against each other, so older NumPy versions return a single False (newer versions raise an error) and np.where finds nothing to delete. If you want to use np.delete, you need a correct list of indices.

On the other hand, this can be done more easily with boolean indexing:
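The shape problem can be checked directly. The sketch below assumes NumPy 1.20 or later for `np.broadcast_shapes`. It shows that comparing against one row of p_rem broadcasts fine, while the two full shapes have no common broadcast shape.

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0], [0, 2, 0], [119, 103, 82],
                       [122, 122, 122], [122, 122, 122], [3, 2, 4]])
p_rem = np.array([[119, 103, 82], [122, 122, 122]])

# (6, 3) against a single row of shape (3,) broadcasts: one test per row.
row_hits = np.all(p_a_colors == p_rem[0], axis=-1)
print(row_hits)  # True only at index 2

# (6, 3) against (2, 3) has no valid broadcast shape.
try:
    np.broadcast_shapes(p_a_colors.shape, p_rem.shape)
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```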

>>> idx = np.array([p_a_colors[i] not in p_rem for i in range(p_a_colors.shape[0])], dtype='bool')
>>> p_a_colors[idx]
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])
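One caveat with the `in` test is worth checking on your own data: for NumPy arrays, `row in p_rem` is true as soon as any single element matches in position, not only when the whole row matches. The probe row below is made up to show the difference, together with a stricter row-wise check.

```python
import numpy as np

p_rem = np.array([[119, 103, 82], [122, 122, 122]])
probe = np.array([119, 0, 0])  # hypothetical row that is NOT a row of p_rem

# `in` behaves like (p_rem == probe).any(): one matching element is enough.
loose = probe in p_rem

# Requiring every element of some row to match gives true row membership.
strict = bool((p_rem == probe).all(axis=1).any())

print(loose, strict)
```

With the example data from the question both tests agree, so the difference only shows up on inputs that share individual values with p_rem.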

Or, inspired by the suggestion of @Jaime, you can also create the indices with np.in1d, here in one line:

>>> idx = ~np.all(np.in1d(p_a_colors, p_rem).reshape(p_a_colors.shape), axis=1)
>>> p_a_colors[idx]
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])

If you must use np.delete, just convert the boolean mask to a sequence of indices:

>>> idx = np.array([p_a_colors[i] in p_rem for i in range(p_a_colors.shape[0])])
>>> idx = np.arange(p_a_colors.shape[0])[idx]
>>> np.delete(p_a_colors, idx, axis=0)
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])
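As a small variation, `np.flatnonzero` turns a boolean mask into integer positions in one call. The sketch below builds the mask with a row-wise broadcast comparison, which is a substitution for the list comprehension above, not part of the original answer.

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0], [0, 2, 0], [119, 103, 82],
                       [122, 122, 122], [122, 122, 122], [3, 2, 4]])
p_rem = np.array([[119, 103, 82], [122, 122, 122]])

# True for every row of p_a_colors that equals some row of p_rem.
mask = (p_a_colors[:, None, :] == p_rem[None, :, :]).all(axis=-1).any(axis=-1)

# Positions of the True entries, in the integer form np.delete expects.
indices = np.flatnonzero(mask)

p_r_colors = np.delete(p_a_colors, indices, axis=0)
print(indices)
print(p_r_colors)
```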

Something went wrong. I tried it with large arrays and it deleted too much.
Which version did you try? Did the arrays have more than 2 dimensions?
I used the second version. No, everything was the same, just with many more values. I have already implemented Jaime's version, and that seems to work fine.
The reason ~np.all(np.in1d(p_a_colors, p_rem).reshape(p_a_colors.shape),axis=1) gives incorrect results is that (before negation) it only says that all members of a given row appear somewhere in the other array, not necessarily in the same row.
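A small made-up input shows the failure mode described in the last comment. The array `colors` is hypothetical: its second row mixes values taken from both rows of p_rem. `np.isin` is used here as the shape-preserving form of the element-wise membership test, assuming a recent NumPy.

```python
import numpy as np

p_rem = np.array([[119, 103, 82], [122, 122, 122]])

# Hypothetical input: row 1 is not in p_rem, but each of its values is.
colors = np.array([[0, 0, 0], [122, 103, 82], [119, 103, 82]])

# Element-wise membership, as in the one-liner: flags row 1 by mistake.
elementwise = np.isin(colors, p_rem).all(axis=1)

# Row-wise membership: only row 2 is really a row of p_rem.
rowwise = (colors[:, None, :] == p_rem[None, :, :]).all(axis=-1).any(axis=-1)

print(elementwise)
print(rowwise)
```

This matches the report of too many rows being deleted on larger arrays, where such accidental combinations become more likely.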
