6

I have two numpy-arrays:

p_a_colors = np.array([[0,0,0], [0,2,0], [119,103,82], [122,122,122], [122,122,122], [3,2,4]])
p_rem = np.array([[119,103,82], [122,122,122]])

I want to delete all the rows from p_a_colors that are in p_rem, so I get:

p_r_colors=np.array([[0,0,0], [0,2,0], [3,2,4]]) 

I think something like this should work:

p_r_colors= np.delete(p_a_colors, np.where(np.all(p_a_colors==p_rem, axis=0)),0) 

but I just don't get the axis or [:] right.

I know that

p_r_colors = copy.deepcopy(p_a_colors)
for i in range(len(p_rem)):
    p_r_colors = np.delete(p_r_colors, np.where(np.all(p_r_colors == p_rem[i], axis=-1)), 0)

would work, but I am trying to avoid Python loops, because I also want good performance.

2
  • Hold on. What is this code supposed to do? Commented May 30, 2013 at 14:59
  • It should give me a new numpy.array p_r_colors, which is p_a_colors-p_rem, same shape as the 2 other arrays Commented May 30, 2013 at 15:09

3 Answers

7

This is how I would do it:

dtype = np.dtype((np.void, (p_a_colors.shape[1] * p_a_colors.dtype.itemsize)))
mask = np.in1d(p_a_colors.view(dtype), p_rem.view(dtype))
p_r_colors = p_a_colors[~mask]

>>> p_r_colors
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])

You need the void dtype view so that NumPy compares each row as a single item. After that, using the built-in set routines seems like the obvious way to go.
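To make the void-dtype step more concrete, here is a minimal sketch of what the view does to the array shapes. It assumes both arrays share the same dtype and number of columns; the `np.ascontiguousarray` calls and the use of `np.isin` (the newer spelling of `np.in1d`) are my additions, not part of the answer above.

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0], [0, 2, 0], [119, 103, 82],
                       [122, 122, 122], [122, 122, 122], [3, 2, 4]])
p_rem = np.array([[119, 103, 82], [122, 122, 122]])

# One void element spans the raw bytes of a whole row.
row_dtype = np.dtype((np.void, p_a_colors.shape[1] * p_a_colors.dtype.itemsize))

# Assumption: the view needs C-contiguous memory, so make that explicit.
a_rows = np.ascontiguousarray(p_a_colors).view(row_dtype)
rem_rows = np.ascontiguousarray(p_rem).view(row_dtype)

print(a_rows.shape)    # (6, 1): each row has collapsed into a single item
print(rem_rows.shape)  # (2, 1)

# Membership is now tested row by row instead of value by value.
mask = np.isin(a_rows, rem_rows).ravel()
p_r_colors = p_a_colors[~mask]
print(p_r_colors.tolist())  # [[0, 0, 0], [0, 2, 0], [3, 2, 4]]
```

Because the comparison works on raw bytes, it is probably only safe when both arrays have identical dtypes; mixing, say, int32 and int64 data would need a cast first.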


1

It's ugly, but

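from functools import reduce  # reduce is no longer a builtin on Python 3

# dtype=bool instead of np.bool, which is not available in every NumPy version
tmp = reduce(lambda x, y: x | np.all(p_a_colors == y, axis=-1), p_rem, np.zeros(p_a_colors.shape[:1], dtype=bool))
indices = np.where(tmp)[0]
np.delete(p_a_colors, indices, axis=0)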

(edit: corrected)

>>> tmp = reduce(lambda x, y: x | np.all(p_a_colors == y, axis=-1), p_rem, np.zeros(p_a_colors.shape[:1], dtype=bool))
>>> indices = np.where(tmp)[0]
>>> np.delete(p_a_colors, indices, axis=0)
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])


1

You are getting the indices wrong. The expression p_a_colors==p_rem does not give you a row-by-row comparison, because the two arrays have different shapes, (6, 3) and (2, 3), which cannot be broadcast against each other. If you want to use np.delete, you need a more correct list of indices.
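One way such an index list could be built without a Python loop is to add explicit axes so the shapes do broadcast. This is a sketch of my own, not something from the question; note that the intermediate array has shape (len(p_a_colors), len(p_rem), 3), so it may be memory-hungry for large inputs.

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0], [0, 2, 0], [119, 103, 82],
                       [122, 122, 122], [122, 122, 122], [3, 2, 4]])
p_rem = np.array([[119, 103, 82], [122, 122, 122]])

# Shapes (6, 1, 3) and (1, 2, 3) broadcast to (6, 2, 3):
# every row of p_a_colors is compared with every row of p_rem.
eq = p_a_colors[:, None, :] == p_rem[None, :, :]

# A row is a match if all of its components agree with at least one row of p_rem.
row_matches = eq.all(axis=-1).any(axis=-1)

# np.delete wants integer positions, so convert the boolean mask.
indices = np.flatnonzero(row_matches)
p_r_colors = np.delete(p_a_colors, indices, axis=0)

print(indices.tolist())     # [2, 3, 4]
print(p_r_colors.tolist())  # [[0, 0, 0], [0, 2, 0], [3, 2, 4]]
```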

On the other hand, this can be more easily done with a boolean index:

>>> idx = np.array([p_a_colors[i] not in p_rem for i in range(p_a_colors.shape[0])], dtype='bool')
>>> p_a_colors[idx]
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])

Or, inspired by the suggestion of @Jaime, you can also create the indices with np.in1d, here in one line:

>>> idx = ~np.all(np.in1d(p_a_colors, p_rem).reshape(p_a_colors.shape), axis=1)
>>> p_a_colors[idx]
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])

If you must use np.delete, just convert the boolean index into a sequence of integer positions:

>>> idx = np.array([p_a_colors[i] in p_rem for i in range(p_a_colors.shape[0])])
>>> idx = np.arange(p_a_colors.shape[0])[idx]
>>> np.delete(p_a_colors, idx, axis=0)
array([[0, 0, 0],
       [0, 2, 0],
       [3, 2, 4]])

4 Comments

Something went wrong. I tried it with large arrays and it deleted too much.
Which version did you try? Did the arrays have more than 2 dimensions?
I used the second version. No, everything was the same, just with many more values. I have already implemented Jaime's version, which seems to work fine.
The reason ~np.all(np.in1d(p_a_colors, p_rem).reshape(p_a_colors.shape),axis=1) gives incorrect results is that (before negation) it only checks that every member of a given row occurs somewhere in the other array, not necessarily together in the same row.
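A small counterexample may make that last point easier to see. The `colors` array below is hypothetical data chosen for illustration: its first row is built from values that all occur in p_rem, but the row itself is not one of p_rem's rows. `np.isin` is used here as the shape-preserving equivalent of `np.in1d` followed by `reshape`.

```python
import numpy as np

p_rem = np.array([[119, 103, 82], [122, 122, 122]])

# Hypothetical input: row 0 mixes values from both rows of p_rem.
colors = np.array([[119, 122, 82], [0, 0, 0], [122, 122, 122]])

# Element-wise membership: "does this value occur anywhere in p_rem?"
flagged_elementwise = np.isin(colors, p_rem).all(axis=1)

# Row-wise membership: "does this whole row occur in p_rem?"
flagged_rowwise = (colors[:, None, :] == p_rem[None, :, :]).all(axis=-1).any(axis=-1)

print(flagged_elementwise.tolist())  # [True, False, True] -> row 0 would be deleted by mistake
print(flagged_rowwise.tolist())      # [False, False, True]
```

With many more rows, such accidental matches become more likely, which would fit the "deleted too much" report.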
