Prevent and correct numpy arrays from being nested rather than multidimensional

Question

Sometimes when creating new 2d arrays I end up with nested arrays rather than proper multidimensional ones. This leads to a number of complications such as misleading array.shape values.

What I mean is I end up with

array([array([...]), array([...])], dtype=object)

when I want

array([[...], [...]])

I'm not sure which point in my code leads to the former scenario. I was wondering (1) what is good practice to avoid obtaining such arrays, and (2) whether there are any pragmatic fixes to convert them to the multidimensional form.

Regarding the latter, performing

np.array([list(i) for i in nested_array])

works but doesn't seem practical, especially if the dimensionality is higher.

Unless you are intentionally creating nested/ragged arrays, there is something wrong with the lists or arrays that you are starting with. They need to be consistent in size. If they differ, you get either the ragged array warning or an error. [a.shape for a in nested_array] should tell you the shapes.
– hpaulj, Commented Mar 29, 2022 at 23:12
Once you have an object dtype array, applying np.stack (or vstack) may change it to a multidimensional array, provided all component arrays match. In effect, it treats the array as a list of arrays. If the shapes aren't consistent, this will raise an error, which may be useful information. Also, this only applies to 1d object dtype arrays. If this doesn't help, you may need to ask a new question with details about how you create the problem array.
– hpaulj, Commented Mar 30, 2022 at 5:00
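The two diagnostics suggested in these comments can be sketched as follows. This is only an illustration: the mismatched `parts` list is made up for the example, and the object array is filled element by element so that the result does not depend on how a given NumPy version treats ragged input to np.array().

```python
import numpy as np

# Hypothetical input: two component arrays with deliberately mismatched lengths.
parts = [np.arange(3), np.arange(4)]

# Fill a 1-D object array one element at a time.
nested_array = np.empty(len(parts), dtype=object)
for i, part in enumerate(parts):
    nested_array[i] = part

# Step 1: inspect the shape of every component array.
shapes = [a.shape for a in nested_array]
print(shapes)  # [(3,), (4,)]

# Step 2: np.stack refuses to combine components whose shapes differ.
try:
    np.stack(nested_array)
    stack_failed = False
except ValueError as exc:
    stack_failed = True
    print("np.stack failed:", exc)
```

If the printed shapes all match, np.stack should succeed; if they differ, the mismatch usually points at the step in the code that produced the odd-sized component.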

Grismar · Accepted Answer · 2022-03-29 22:54:38Z


If you have an array of arrays, for example:

import numpy as np

# Build a 1-D object array whose elements are themselves arrays.
arr1 = np.empty(3, dtype=object)
arr1[:] = [np.arange(3), np.arange(3, 6), np.arange(6, 9)]
print(repr(arr1))

Result:

array([array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])], dtype=object)

Note the dtype there; it may be causing some of your trouble. Compare:

arr2 = np.array([np.arange(3), np.arange(3, 6), np.arange(6, 9)])
print(arr2)
print(arr2.dtype)

Result:

[[0 1 2]
 [3 4 5]
 [6 7 8]]
int32

To turn arr1 into an array just like arr2, which is what you are asking about:

arr3 = np.stack(arr1)
print(arr3)
print((arr2 == arr3).all())

Result:

[[0 1 2]
 [3 4 5]
 [6 7 8]]
True

So, make sure your arrays have the datatype you need, and if you cannot avoid ending up with an array of arrays, combine them with numpy.stack().
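For nesting that goes more than one level deep, which the question mentions, one possible approach is to apply numpy.stack() recursively. The helper below is a hypothetical sketch, not part of NumPy: the name `densify` is made up here, and it assumes every component at a given level has the same shape (otherwise numpy.stack() raises a ValueError).

```python
import numpy as np

def densify(arr):
    """Hypothetical helper: turn a nested object array into a regular ndarray."""
    arr = np.asarray(arr)
    # Regular (non-object) arrays and 0-d arrays are returned unchanged.
    if arr.dtype != object or arr.ndim == 0:
        return arr
    # Convert each element first, then stack the results along a new first axis.
    return np.stack([densify(item) for item in arr])

# Example: an object array whose elements are themselves object arrays.
inner = np.empty(2, dtype=object)
inner[0] = np.arange(3)
inner[1] = np.arange(3, 6)

outer = np.empty(2, dtype=object)
outer[0] = inner
outer[1] = inner

dense = densify(outer)
print(dense.shape)  # (2, 2, 3)
print(dense.dtype != object)  # True
```

For a single level of nesting this reduces to a plain np.stack(arr1) call, so the helper is only worth having when the depth is not known in advance.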

