I have multiple images of the same size that I want to put in a NumPy array so that the result is an array of shape (len(img_list), 1).

img_list = [img, img, img]
img.shape, type(img_list), len(img_list)

((1056, 2034, 3), list, 3)

My main problem is that the following happens when I use numpy.array():

a = np.array(img_list, dtype=object)  # np.object was an alias of the builtin object and is removed in recent NumPy
type(a), a.dtype, a.shape, a.ndim

(numpy.ndarray, dtype('O'), (3, 1056, 2034, 3), 4)

Note that the number of dimensions is four, not two as expected.

So far the best method I have found to get shape (len(img_list), 1) is to create an empty object array of the desired shape and then assign the list to its first column:

a = np.empty((3, 1), dtype=object)  # np.object was an alias of the builtin object and is removed in recent NumPy
type(a), a.dtype, a.shape, a

(numpy.ndarray, dtype('O'), (3, 1), array([[None], [None], [None]], dtype=object))

a[:,0] = img_list
type(a), a.dtype, a.shape

(numpy.ndarray, dtype('O'), (3, 1))

This yields the desired shape.
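The pre-allocate-and-fill approach above can be wrapped in a small helper. This is only a minimal sketch: the name `to_object_column` is hypothetical (not a NumPy function), and a small zero-filled array stands in for a real image. It fills the cells one at a time, so each cell holds a whole image array.

```python
import numpy as np

def to_object_column(items):
    """Pack each item into one cell of an (n, 1) object array (hypothetical helper)."""
    out = np.empty((len(items), 1), dtype=object)
    for i, item in enumerate(items):
        # Assigning to a single cell stores the array itself as one object.
        out[i, 0] = item
    return out

img = np.zeros((4, 5, 3), dtype=np.uint8)  # small stand-in for a real image
a = to_object_column([img, img, img])
print(a.shape, a.ndim, a[0, 0].shape)
```

Each cell still contains the full image, so a[i, 0] can be passed to functions that expect a single image.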

Is there a numpy function that can do that directly without creating an empty array first?


EDIT

I thought using numpy.hstack or numpy.stack should do the trick but this results in the "wrong" dimension:

a_stacked = np.stack(img_list)
type(a_stacked), a_stacked.dtype, a_stacked.shape, a_stacked.ndim

(numpy.ndarray, dtype('uint8'), (3, 1056, 2034, 3), 4)

To clarify: I would like a.ndim == 2 and not a.ndim == 4. In other words, a.shape should be (3,1) and not (3, 1056, 2034, 3).

  • May I ask why you want to keep the number of dimensions at 2? There is probably a better solution to your more general problem. Commented Jan 24, 2020 at 23:25
  • This seems like an XY problem. I can't imagine a reason for wanting an object array of same-shaped numerical arrays. dtype('O') arrays break most numpy methods, and require treating them like large, slow-performing lists. Commented Jan 25, 2020 at 7:58
  • In my problem the first dimension should represent the samples (number of images - image samples). The shape of the second dimension shows the number of features. At the start of a scikit-learn processing pipeline I want to operate on the image feature as a whole. I couldn't find a fast numpy method that iterates over the first dimension (samples) and applies an operation to the rest (image feature). Please correct me if there is a more appropriate way. Commented Jan 25, 2020 at 8:16
  • @DanielF, I wouldn't go this way if I could use vectorization. However, the functions I am dealing with mostly operate on single images. To speed things up I would first have to change the functions to operate on speedy numpy arrays. Commented Jan 25, 2020 at 8:24
  • Then make it a nested list of arrays. It'll be faster and you'll have direct access to list comprehension tools. Commented Jan 25, 2020 at 9:53
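The list-based suggestion in the comments could look roughly like the sketch below. The per-image function `mean_color` is a hypothetical stand-in for whatever single-image function is actually used, and the small constant-valued arrays stand in for real images. The images stay in a plain list, and only the per-image results are stacked into a 2-D (samples, features) array.

```python
import numpy as np

def mean_color(img):
    """Hypothetical per-image feature: the mean of each colour channel."""
    return img.reshape(-1, img.shape[-1]).mean(axis=0)

# Stand-in images; real ones would have shape (1056, 2034, 3).
img_list = [np.full((4, 5, 3), v, dtype=np.uint8) for v in (0, 10, 20)]

# Apply the single-image function with a list comprehension, then stack the results.
features = np.stack([mean_color(img) for img in img_list])
print(features.shape)
```

Only the extracted features end up in a numerical array, so no object array is needed at that stage.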

1 Answer

I'm not sure what you want to achieve, but wouldn't numpy.stack() do? That's how I usually create batches of images.
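As a rough illustration of what numpy.stack() returns, and assuming a flat feature vector per image would be acceptable (which the question does not confirm), the stacked batch can be reshaped to two dimensions. The small zero-filled array is a stand-in for a real image. Note that this gives shape (n, height * width * channels), not (n, 1).

```python
import numpy as np

img = np.zeros((4, 5, 3), dtype=np.uint8)  # small stand-in for a real image
img_list = [img, img, img]

batch = np.stack(img_list)               # one new leading axis: (3, 4, 5, 3)
flat = batch.reshape(len(img_list), -1)  # one row per image: (3, 60)
print(batch.shape, flat.shape)
```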

1 Comment

Thanks for suggesting numpy.stack(), I tried that too but the output regarding shape and ndim is not what I want. I've added the desired output shape and dimension.
