Consider the following arrays:

import numpy as np
a1 = np.array([1,2,3],dtype='object')
a2 = np.array([["A"],["D"],["R"]],dtype='object')
a3 = np.array([["A","F"],["D"],["R"]],dtype='object')

The following two expressions give different kinds of output, which I did not expect. Is this normal?

np.c_[a1,a2]

#array([[1, 'A'],
#      [2, 'D'],
#      [3, 'R']], dtype=object)

np.c_[a1,a3]

#array([[1, list(['A', 'F'])],
#      [2, list(['D'])],
#      [3, list(['R'])]], dtype=object)

Why does the first expression work but not the second? I do not see any difference between a2 and a3. Also, which concatenation method (c_, stack, concatenate) would give the same type of output in both cases, without extra lines of code to check the output's data type and convert it as needed?

np.concatenate((a1,a2),axis=0) # Error: ValueError: all the input arrays must have same number of dimensions

np.concatenate((a1,a3),axis=0) # works
#array([1, 2, 3, list(['A', 'F']), list(['D']), list(['R'])], dtype=object)
  • Numpy really isn't designed around working with arrays of list objects. Why are you using this? Commented May 2, 2020 at 8:55
  • @juanpa.arrivillaga what do you suggest for this kind of operations. I need to append a1 and a2 sidewise. Commented May 2, 2020 at 11:43
  • What common behavior are you expecting? Look at a2 and a3. They are very different. Commented May 2, 2020 at 15:23
  • You should just be using python lists. Why use numpy? Commented May 2, 2020 at 18:16

2 Answers

That actually makes sense, see the types of each one of the arrays:

a1 = np.array([1,2,3],dtype='object') => 1D array of objects, size 3
a2 = np.array([["A"],["D"],["R"]],dtype='object') => 2D array of objects, size 3x1
a3 = np.array([["A","F"],["D"],["R"]],dtype='object') => 1D array of lists of objects

a3 is an array of lists. NumPy arrays with two or more dimensions must be rectangular: you can't have one row of length 2 and another row of length 1. Much of NumPy's computational efficiency comes from the way arrays are laid out in memory.

So NumPy interprets np.array([["A","F"],["D"],["R"]],dtype='object') as a 1D array of lists (which are also objects). Attempting this with a different dtype results in an error:

np.array([[1,2],[3],[4]],dtype=np.int64) -->
ValueError: setting an array element with a sequence.

Therefore np.concatenate((a1,a2),axis=0) fails, because a1 has shape (3,) and a2 has shape (3,1), while a1 and a3 both have shape (3,).
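To make the shape difference concrete, here is a minimal sketch that inspects the three arrays from the question and reproduces the failing call. The names shapes, a3_first and mismatch_error are only illustrative and not part of the original post.

```python
import numpy as np

a1 = np.array([1, 2, 3], dtype=object)
a2 = np.array([["A"], ["D"], ["R"]], dtype=object)
a3 = np.array([["A", "F"], ["D"], ["R"]], dtype=object)

# Collect each array's shape for comparison.
shapes = {"a1": a1.shape, "a2": a2.shape, "a3": a3.shape}
print(shapes)

# a3 is 1-D: each of its elements is a whole Python list.
a3_first = a3[0]
print(type(a3_first).__name__)

# Joining a 1-D array with a 2-D array along axis 0 raises ValueError.
try:
    np.concatenate((a1, a2), axis=0)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)
```

Checking .shape and .ndim first is usually the quickest way to see why two arrays that print similarly behave differently.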

You could solve it by:

np.concatenate((a1,np.reshape(a2,a1.shape)))
np.concatenate((np.reshape(a1,a2.shape),a2))

Both are valid, but they give different results: the first returns a 1D array of shape (6,), the second a 2D array of shape (6,1). There is no single solution, because the concatenation of a1 and a2 is ambiguous.
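As a rough illustration of that ambiguity, the sketch below runs both reshapes and adds a third variant with axis=1, which is one way to get the side-by-side layout asked about in the comments. The variable names are assumptions made for this example.

```python
import numpy as np

a1 = np.array([1, 2, 3], dtype=object)
a2 = np.array([["A"], ["D"], ["R"]], dtype=object)

# Flatten a2 to match a1: one long 1-D array.
flat = np.concatenate((a1, np.reshape(a2, a1.shape)))

# Turn a1 into a column to match a2: one long column.
column = np.concatenate((np.reshape(a1, a2.shape), a2))

# Same reshape, but joined along axis 1: two columns side by side.
side_by_side = np.concatenate((np.reshape(a1, a2.shape), a2), axis=1)

print(flat.shape, column.shape, side_by_side.shape)
print(side_by_side)
```

Which of the three is "right" depends only on the layout you want; the important part is making both inputs the same number of dimensions before concatenating.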


From the NumPy docs: "In particular, arrays will be stacked along their last axis after being upgraded to at least 2-D with 1’s post-pended to the shape (column vectors made out of 1-D arrays)." This means np.c_ first converts a 1D array to a 2D column and then concatenates along the second axis. Here is what happens:

In case of np.c_[a1,a2]:

  1. Convert a1 to [[1],[2],[3]]
  2. Stack [[1],[2],[3]] with [["A"],["D"],["R"]] along the second axis, which results in:

    [[1 'A']
     [2 'D']
     [3 'R']]
    

In case of np.c_[a1,a3]:

  1. Convert a1 to [[1],[2],[3]]
  2. Try to stack [[1],[2],[3]] with [["A","F"],["D"],["R"]] along the second axis. However, NumPy arrays have to be rectangular and a3 is not: it is a 1D array whose items are lists. So each list is treated as a single item, and a3 becomes a column just like a1, which gives a rectangular array of shape (3,2):

    [[1 list(['A', 'F'])]
     [2 list(['D'])]
     [3 list(['R'])]]
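The two cases above can be compared directly. This is a small sketch, with illustrative variable names, showing that both np.c_ results have the same shape and dtype and differ only in what the second column holds.

```python
import numpy as np

a1 = np.array([1, 2, 3], dtype=object)
a2 = np.array([["A"], ["D"], ["R"]], dtype=object)
a3 = np.array([["A", "F"], ["D"], ["R"]], dtype=object)

with_a2 = np.c_[a1, a2]
with_a3 = np.c_[a1, a3]

# Both results are (3, 2) object arrays.
print(with_a2.shape, with_a3.shape)

# The difference is the element type in the second column:
# plain strings for a2, whole lists for a3.
print(type(with_a2[0, 1]).__name__, type(with_a3[0, 1]).__name__)
```

So np.c_ behaves consistently here; the inputs are what differ.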
    

Depending on how you would like the output to look, there are different ways to do this. If you simply want to concatenate a mix of 1D/2D arrays into one 1D array, you can first squeeze them (remove dimensions of size 1) and then concatenate, like this:

np.concatenate((np.squeeze(a1),np.squeeze(a2)),axis=0)
#[1 2 3 'A' 'D' 'R']
np.concatenate((np.squeeze(a1),np.squeeze(a3)),axis=0)
#[1 2 3 list(['A', 'F']) list(['D']) list(['R'])]

You can also use hstack to concatenate the contents of all the inner lists:

np.concatenate((np.hstack(a1),np.hstack(a2)),axis=0)
#[1 2 3 'A' 'D' 'R']
np.concatenate((np.hstack(a1),np.hstack(a3)),axis=0)
#['1' '2' '3' 'A' 'F' 'D' 'R']
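Note that in the last output the integers come back as strings, which appears to be the result of joining an integer array with a string array. If you want to flatten the inner lists but keep the original Python objects, one option is to flatten with itertools.chain and stay in dtype=object throughout. This is an alternative to the hstack call above, not what the answer originally used, and the names flat_a3 and merged are made up for the example.

```python
from itertools import chain

import numpy as np

a1 = np.array([1, 2, 3], dtype=object)
a3 = np.array([["A", "F"], ["D"], ["R"]], dtype=object)

# Flatten the inner lists of a3 into a single 1-D object array.
flat_a3 = np.array(list(chain.from_iterable(a3)), dtype=object)

# Both inputs are object arrays, so the integers stay integers.
merged = np.concatenate((a1, flat_a3))
print(merged)
```

Because nothing is converted to a string dtype, merged still holds the integers 1, 2, 3 alongside the letters.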
