22

I want to turn my array of arrays into just a single array. From something like:

    array([ array([[0, 0, 0, ..., 1, 0, 0],
           [0, 1, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 2, 0, 0],
           ...,
           array([[0, 0, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 8, 0, 2],
           ...,
           [0, 0, 0, ..., 0, 0, 0],
           [1, 0, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 1, 0, 0]], dtype=uint8)], dtype=object)

which has shape (10,), to a 3D numpy array of shape (10, 518, 32):

    array([[[0, 0, 0, ..., 0, 0, 0],
            [0, 0, 0, ..., 0, 0, 0],
            [0, 0, 0, ..., 0, 0, 0],
            ...,
            [0, 0, 0, ..., 0, 0, 0],
            [0, 0, 0, ..., 0, 0, 0],
            [0, 0, 0, ..., 0, 0, 0]]], dtype=uint8)

I've tried converting everything into a list and then calling np.asarray, and I've also tried giving everything the same dtype=uint8, but I couldn't get it into the 3D form.

7
  • This looks like a case where you should fix the problem upstream. Why do you even have arrays of arrays? This should have been a single 3D array from the start. Commented Feb 4, 2016 at 4:40
  • I agree with the first comment, but you could also do a list comprehension with np.ndarray.tolist(). Something like np.array([arr.tolist() for arr in my_arrays]) Commented Feb 4, 2016 at 5:18
  • could np.reshape() not be of use here? Commented Feb 4, 2016 at 5:34
  • Could you please post an example that we can run, say with a (2,4,3) array of arrays? If I type what I think is a small version of your example, I don't get a shape (10,). Commented Feb 4, 2016 at 5:48
  • Are you certain that all contained arrays have the same shape? Commented Feb 4, 2016 at 8:28

4 Answers

23

np.concatenate should do the trick:

Make an object array of arrays:

    In [23]: arr=np.empty((4,),dtype=object)
    In [24]: for i in range(4):arr[i]=np.ones((2,2),int)*i
    In [25]: arr
    Out[25]:
    array([array([[0, 0],
           [0, 0]]), array([[1, 1],
           [1, 1]]), array([[2, 2],
           [2, 2]]), array([[3, 3],
           [3, 3]])], dtype=object)
    In [28]: np.concatenate(arr)
    Out[28]:
    array([[0, 0],
           [0, 0],
           [1, 1],
           [1, 1],
           [2, 2],
           [2, 2],
           [3, 3],
           [3, 3]])

Or with a reshape:

    In [26]: np.concatenate(arr).reshape(4,2,2)
    Out[26]:
    array([[[0, 0],
            [0, 0]],

           [[1, 1],
            [1, 1]],

           [[2, 2],
            [2, 2]],

           [[3, 3],
            [3, 3]]])
    In [27]: _.shape
    Out[27]: (4, 2, 2)

concatenate effectively treats its input as a list of arrays, so it works regardless of whether the input is an object array, a list, or a 3d array.

This can't be done simply with a reshape. arr is an array of pointers - pointing to arrays located elsewhere in memory. To get a single 3d array, all of the pieces have to be copied into one buffer. That's what concatenate does - it creates a large empty array and copies each piece into it, but it does so in compiled code.
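As a rough illustration of the point above, the sketch below shows that reshape fails on the object array, while copying the pieces into one buffer works. It uses np.stack, which is not mentioned in this answer; it is swapped in here as an assumption-level alternative that does the concatenate-plus-reshape step in one call by adding a new leading axis.

```python
import numpy as np

# Build a (4,) object array whose elements are separate (2, 2) arrays.
arr = np.empty((4,), dtype=object)
for i in range(4):
    arr[i] = np.ones((2, 2), dtype=int) * i

# The object array holds only 4 references, so there is nothing to
# reshape into 16 numbers.
try:
    arr.reshape(4, 2, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print("reshape failed:", reshape_failed)

# np.stack copies every element into one new buffer along a new axis 0.
stacked = np.stack(arr)
print(stacked.shape)

# Same values as the concatenate + reshape route shown above.
same = np.array_equal(stacked, np.concatenate(arr).reshape(4, 2, 2))
print("matches concatenate+reshape:", same)
```

Unlike the reshape after concatenate, this does not need the target shape spelled out by hand, which may be convenient when the element shape is not known in advance.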


np.array does not change it:

    In [37]: np.array(arr).shape
    Out[37]: (4,)

But treating arr as a list of arrays does work, although it is slower than the concatenate version because np.array analyses its inputs more:

    In [38]: np.array([x for x in arr]).shape
    Out[38]: (4, 2, 2)

2 Comments

The [x for x in arr] should be replaced by arr.tolist(). This would lead to faster code than np.concatenate().
@norok2, I don't see a consistent difference in timings.
6

Perhaps late to the party, but I believe the most efficient approach is:

np.array(arr.tolist()) 

To give some idea of how it would work:

    import numpy as np

    N, M, K = 4, 3, 2
    arr = np.empty((N,), dtype=object)
    for i in range(N):
        arr[i] = np.full((M, K), i)

    print(arr)
    # [array([[0, 0],
    #        [0, 0],
    #        [0, 0]])
    #  array([[1, 1],
    #        [1, 1],
    #        [1, 1]])
    #  array([[2, 2],
    #        [2, 2],
    #        [2, 2]])
    #  array([[3, 3],
    #        [3, 3],
    #        [3, 3]])]

    new_arr = np.array(arr.tolist())
    print(new_arr)
    # [[[0 0]
    #   [0 0]
    #   [0 0]]
    #  [[1 1]
    #   [1 1]
    #   [1 1]]
    #  [[2 2]
    #   [2 2]
    #   [2 2]]
    #  [[3 3]
    #   [3 3]
    #   [3 3]]]

...and the timings:

    %timeit np.array(arr.tolist())
    # 100000 loops, best of 3: 2.48 µs per loop

    %timeit np.concatenate(arr).reshape(N, M, K)
    # 100000 loops, best of 3: 3.28 µs per loop

    %timeit np.array([x for x in arr])
    # 100000 loops, best of 3: 3.32 µs per loop
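The numbers above are for a very small array, and a comment on another answer reports no consistent difference, so it may be worth re-running the comparison on your own data. Here is a minimal sketch using the standard-library timeit module instead of IPython's %timeit; the shape (10, 518, 32) is taken from the question, and the repeat counts are arbitrary assumptions. No timings are quoted because they depend on the machine and NumPy version.

```python
import timeit

import numpy as np

N, M, K = 10, 518, 32  # shape from the question

# Object array of N separate (M, K) uint8 arrays.
arr = np.empty((N,), dtype=object)
for i in range(N):
    arr[i] = np.full((M, K), i, dtype=np.uint8)

candidates = {
    "np.array(arr.tolist())": lambda: np.array(arr.tolist()),
    "np.concatenate(arr).reshape(N, M, K)": lambda: np.concatenate(arr).reshape(N, M, K),
    "np.array([x for x in arr])": lambda: np.array([x for x in arr]),
}

# Run each candidate once so the outputs can be compared for correctness.
results = {name: fn() for name, fn in candidates.items()}

# Report the best average time per call over a few repeats.
for name, fn in candidates.items():
    best = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print(f"{name}: {best * 1e6:.1f} µs per call")
```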


5

I had the same issue extracting a column from a Pandas DataFrame containing an array in each row:

    joined["ground truth"].values
    # outputs
    array([array([0, 0, 0, 0, 0, 0, 0, 0]),
           array([0, 0, 0, 0, 0, 0, 0, 0]),
           array([0, 0, 0, 0, 0, 0, 0, 0]),
           ...,
           array([0, 0, 0, 0, 0, 0, 0, 0]),
           array([0, 0, 0, 0, 0, 0, 0, 0]),
           array([0, 0, 0, 0, 0, 0, 0, 0])], dtype=object)

np.concatenate didn't help because it merged the arrays into a flat array (same as np.hstack). Instead, I needed to vertically stack them with np.vstack:

    np.vstack(joined["ground truth"].values)
    # outputs
    array([[0, 0, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 0, 0, 0],
           ...,
           [0, 0, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 0, 0, 0],
           [0, 0, 0, ..., 0, 0, 0]])
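To make the difference concrete without needing the DataFrame, here is a small sketch with a stand-in object array of 1-D arrays. The name col and its values are made up for illustration; they are not the data from this answer.

```python
import numpy as np

# Stand-in for a column whose cells each hold a 1-D array of length 8.
col = np.empty((3,), dtype=object)
for i in range(3):
    col[i] = np.arange(8) + 10 * i

# Concatenating 1-D pieces joins them end to end along their only axis.
flat = np.concatenate(col)
print(flat.shape)  # (24,)

# vstack first promotes each piece to a row, then stacks the rows.
rows = np.vstack(col)
print(rows.shape)  # (3, 8)
```

For 2-D elements, as in the question, np.vstack would instead behave like np.concatenate along axis 0, so which function fits depends on the dimensionality of the contained arrays.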


0

One way is to allocate the target array and copy the objects into it in a loop.

    import numpy as np

    x = np.array([
        np.array([[0, 0, 0, 1, 0, 0],
                  [0, 1, 0, 0, 0, 0],
                  [0, 0, 3, 7, 0, 0],
                  [0, 0, 0, 2, 0, 0]], dtype=np.uint8),
        np.array([[0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0],
                  [0, 0, 4, 8, 0, 0],
                  [0, 0, 0, 8, 0, 2]], dtype=np.uint8),
        np.array([[0, 0, 0, 0, 0, 0],
                  [1, 0, 0, 0, 0, 0],
                  [0, 0, 5, 9, 0, 0],
                  [0, 0, 0, 1, 0, 0]], dtype=np.uint8)], dtype=object)

    print(len(x))
    print(x[0].shape)

    # Allocate the target array, then copy each element into its slot.
    y = np.zeros([len(x), x[0].shape[0], x[0].shape[1]], dtype=np.uint8)
    print(y.shape)

    for i in range(len(x)):
        y[i, :, :] = x[i]

    print(y)

If I understand what you're asking, this is the desired result:

    3
    (4, 6)
    (3, 4, 6)
    [[[0 0 0 1 0 0]
      [0 1 0 0 0 0]
      [0 0 3 7 0 0]
      [0 0 0 2 0 0]]

     [[0 0 0 0 0 0]
      [0 0 0 0 0 0]
      [0 0 4 8 0 0]
      [0 0 0 8 0 2]]

     [[0 0 0 0 0 0]
      [1 0 0 0 0 0]
      [0 0 5 9 0 0]
      [0 0 0 1 0 0]]]
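The same allocate-and-copy idea can be wrapped in a function that also checks the point raised in the comments on the question, namely whether all contained arrays really share one shape. This is only a sketch: stack_checked is a hypothetical helper name, not a NumPy function, and it assumes a non-empty 1-D object array whose elements are NumPy arrays of one dtype.

```python
import numpy as np


def stack_checked(obj_arr):
    """Hypothetical helper: copy equal-shaped arrays into one new array."""
    # Collect the distinct element shapes; more than one means ragged input.
    shapes = {np.shape(a) for a in obj_arr}
    if len(shapes) != 1:
        raise ValueError(f"elements have differing shapes: {sorted(shapes)}")

    # Allocate the target once, taking shape and dtype from the first element.
    first = obj_arr[0]
    out = np.empty((len(obj_arr),) + first.shape, dtype=first.dtype)

    # Copy each element into its slot along the new leading axis.
    for i, a in enumerate(obj_arr):
        out[i] = a
    return out


# Usage: three (4, 6) uint8 arrays held in a (3,) object array.
x = np.empty((3,), dtype=object)
for i in range(3):
    x[i] = np.full((4, 6), i, dtype=np.uint8)

y = stack_checked(x)
print(y.shape, y.dtype)  # (3, 4, 6) uint8
```

If the shapes differ, the ValueError lists them, which is usually a quicker way to find the odd element than reading the repr of the object array.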
