Boolean indexing assignment of a numpy array to a numpy array

Question

I am seeing some behavior with Boolean indexing that I do not understand, and I was hoping to find some clarification here.

First off, this is the behavior I am seeking...

>>> import numpy as np
>>> a = np.zeros(10, dtype=np.ndarray)
>>> a
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=object)
>>> b = np.arange(10).reshape(2,5)
>>> b
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> a[5] = b
>>> a
array([0, 0, 0, 0, 0, array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]]), 0,
       0, 0, 0], dtype=object)
>>>

The reason for choosing an ndarray of ndarrays is that I will be appending to the arrays stored in the super array, and they will all be of different lengths. I chose ndarray rather than list for the super array so that I have access to all of NumPy's clever indexing features.

Anyway, if I make a Boolean indexer and use it to assign, say, b+5 at position 1, it does something I didn't expect:

>>> indexer = np.zeros(10,dtype='bool')
>>> indexer
array([False, False, False, False, False, False, False, False, False, False], dtype=bool)
>>> indexer[1] = True
>>> indexer
array([False,  True, False, False, False, False, False, False, False, False], dtype=bool)
>>> a[indexer] = b+5
>>> a
array([0, 5, 0, 0, 0, array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]]), 0,
       0, 0, 0], dtype=object)
>>>

Can anyone help me understand what's going on? I would like the result to be

>>> a[1] = b+5
>>> a
array([0, array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]]), 0, 0,
       0, array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]]), 0, 0, 0, 0], dtype=object)
>>>

The final goal is to have a lot of "b" arrays stored in B, and to assign them to a like this

>>> a[indexer] = B[indexer]

EDIT:

Found a possible workaround based on the discussion below. I can wrap my data in a class if I need to:

>>>
>>> class myclass:
...     def __init__(self):
...             self.data = np.random.rand(1)
...
>>>
>>> b = myclass()
>>> b
<__main__.myclass object at 0x000002871A4AD198> 
>>> b.data
array([ 0.40185378])
>>>
>>> a[indexer] = b
>>> a
array([None, <__main__.myclass object at 0x000002871A4AD198>, None, None,
       None, None, None, None, None, None], dtype=object)
>>> a[1].data
array([ 0.40185378])

EDIT: this actually fails. I cannot assign anything to the data field when indexed.

it does not :( it fails... but thanks for the info! I will do that in the future
– Tsadkiel, Commented Apr 26, 2017 at 23:08

hpaulj · Accepted Answer · 2017-04-26 23:07:05Z

In [203]: a = np.empty(5, object)
In [204]: a
Out[204]: array([None, None, None, None, None], dtype=object)
In [205]: a[3]=np.arange(3)
In [206]: a
Out[206]: array([None, None, None, array([0, 1, 2]), None], dtype=object)

So simple indexing works with this object array.

Boolean indexing works for reading:

In [207]: a[np.array([0,0,0,1,0], dtype=bool)]
Out[207]: array([array([0, 1, 2])], dtype=object)
In [208]: a[np.array([0,0,1,0,0], dtype=bool)]
Out[208]: array([None], dtype=object)

But has problems when writing:

In [209]: a[np.array([0,0,1,0,0], dtype=bool)]=np.arange(2)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-209-c1ef5580972c> in <module>()
----> 1 a[np.array([0,0,1,0,0], dtype=bool)]=np.arange(2)

ValueError: NumPy boolean array indexing assignment cannot assign 2 
input values to the 1 output values where the mask is true

Indexing with np.where(<boolean>) or with the list [2] also gives problems:

In [221]: a[[2]]=np.arange(3)
/usr/local/bin/ipython3:1: DeprecationWarning: assignment will raise an 
error in the future, most likely because your index result shape does 
not match the value array shape. You can use `arr.flat[index] = values`    
to keep the old behaviour.

So, for whatever reason, indexed assignment to an object dtype array does not work as well as it does with regular arrays.

Even the recommended flat doesn't work; it stores only the first element of the RHS:

In [226]: a.flat[[2]]=np.arange(3)
In [227]: a
Out[227]: array([None, None, 0, array([0, 1, 2]), None], dtype=object)

I can assign an object that is not a list or array:

In [228]: a[[2]]=None
In [229]: a
Out[229]: array([None, None, None, array([0, 1, 2]), None], dtype=object)
In [230]: a[[2]]={3:4}
In [231]: a
Out[231]: array([None, None, {3: 4}, array([0, 1, 2]), None], dtype=object)
In [232]: idx=np.array([0,0,1,0,0],bool)
In [233]: a[idx]=set([1,2,3])
In [234]: a
Out[234]: array([None, None, {1, 2, 3}, array([0, 1, 2]), None], dtype=object)

object dtype arrays are at the edge of numpy array functionality.

Look at what we get with getitem. With a scalar index we get whatever object is stored in that slot (in my latest case, a set). But with [[2]] or a boolean index, we get another object array:

In [235]: a[2]
Out[235]: {1, 2, 3}
In [236]: a[[2]]
Out[236]: array([{1, 2, 3}], dtype=object)
In [237]: a[idx]
Out[237]: array([{1, 2, 3}], dtype=object)
In [238]: a[idx].shape
Out[238]: (1,)

I suspect that when a[idx] is on the LHS, it tries to convert the RHS to an object array first:

In [241]: np.array(np.arange(3), object)
Out[241]: array([0, 1, 2], dtype=object)
In [242]: _.shape
Out[242]: (3,)
In [243]: np.array(set([1,2,3]), object)
Out[243]: array({1, 2, 3}, dtype=object)
In [244]: _.shape
Out[244]: ()

In the case of a set, the resulting array has a single element (shape ()) and can be put in the (1,) slot. But when the RHS is a list or array, the result is an n-element array, e.g. shape (3,), which does not fit in the (1,) slot.
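To make the shape argument concrete, here is a minimal sketch of the suspected conversion step. The `fits` helper is hypothetical (it is not part of NumPy) and only uses broadcasting as a rough model of the check that mask assignment performs.

```python
import numpy as np

# An array on the RHS is converted element by element, so it keeps its length.
rhs_array = np.array(np.arange(3), dtype=object)
print(rhs_array.shape)   # (3,)

# A set is not treated as a sequence, so it becomes a single 0-d object element.
rhs_set = np.array({1, 2, 3}, dtype=object)
print(rhs_set.shape)     # ()

def fits(rhs_shape, slot_shape):
    """Hypothetical helper: can an RHS of rhs_shape broadcast into slot_shape?"""
    try:
        np.broadcast_to(np.empty(rhs_shape), slot_shape)
        return True
    except ValueError:
        return False

print(fits(rhs_set.shape, (1,)))    # True: one element fills the one slot
print(fits(rhs_array.shape, (1,)))  # False: three elements, one slot
```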

Solution (sort of)

If you want to assign a list/array to a slot in an object array with some form of advanced indexing (boolean or list), first put that item in an object array of the correct size:

In [255]: b=np.empty(1,object)
In [256]: b[0]=np.arange(3)
In [257]: b
Out[257]: array([array([0, 1, 2])], dtype=object)
In [258]: b.shape
Out[258]: (1,)
In [259]: a[idx]=b
In [260]: a
Out[260]: array([None, None, array([0, 1, 2]), array([0, 1, 2]), None], dtype=object)

Or working with your slightly larger arrays:

In [264]: a = np.zeros(10, dtype=object)
In [265]: b = np.arange(10).reshape(2,5)
In [266]: a[5] = b
In [267]: c = np.zeros(1, dtype=object)  # intermediate object wrapper
In [268]: c[0] = b+5
In [269]: idx = np.zeros(10,bool)
In [270]: idx[1]=True
In [271]: a[idx] = c
In [272]: a
Out[272]: 
array([0, array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]]), 0, 0,
       0, array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]]), 0, 0, 0, 0], dtype=object)

If idx has n True items, then c has to have a shape that will broadcast to (n,).
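For more than one True entry, the same wrapping idea might look like the sketch below. `wrap_objects` is a hypothetical helper name, not a NumPy function; it fills the wrapper one slot at a time with scalar indexing, which stores each item as-is.

```python
import numpy as np

def wrap_objects(items):
    """Hypothetical helper: pack each item into its own slot of a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item          # scalar indexing stores the item unchanged
    return out

a = np.zeros(10, dtype=object)
idx = np.zeros(10, dtype=bool)
idx[[1, 4]] = True             # two True entries, so the RHS needs shape (2,)

b = np.arange(10).reshape(2, 5)
a[idx] = wrap_objects([b + 5, np.arange(3)])

print(a[1].shape)   # (2, 5)
print(a[4])         # [0 1 2]
```

The items can have different shapes, because the wrapper only ever sees them as opaque objects.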

It looks less buggy when I make sure that the RHS dtype matches the LHS (i.e. object dtype). Then it's just the standard business of broadcastable shapes. It comes back to that old question - how to unambiguously turn a list or array into an object array of known shape.
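This also suggests that the a[indexer] = B[indexer] form from the question should behave, as long as B is itself an object array. A small sketch, assuming a B filled slot by slot (its contents here are made up for illustration):

```python
import numpy as np

# Hypothetical B: an object array holding arrays of different lengths
B = np.empty(4, dtype=object)
for i in range(4):
    B[i] = np.arange(i + 1)

a = np.zeros(4, dtype=object)
indexer = np.array([False, True, False, True])

# B[indexer] is already an object array of shape (2,), so the RHS needs
# no conversion and its shape matches the two selected slots.
a[indexer] = B[indexer]

print(a[1])   # [0 1]
print(a[3])   # [0 1 2 3]
```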

Comments

Is it possible to assign and append with this indexing as well? If we had the right shape of C with values in it, does a[idx].append(C[idx]) make any... sense?

a[idx] is an object array, not a list. It does not have an append method. a[2] could be a list, and thus be appendable. You could put a large list or array in c, and then assign that.
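
One way to act on this, sketched under the assumption that every slot holds a Python list, is to loop over the selected positions and append through a scalar index. The `new_items` mapping is made up for illustration.

```python
import numpy as np

a = np.empty(4, dtype=object)
for i in range(len(a)):
    a[i] = []                  # each slot gets its own Python list

idx = np.array([False, True, False, True])
new_items = {1: "x", 3: "y"}   # hypothetical values to append, keyed by position

# a[i] with a scalar index returns the stored list itself, so append works
for i in np.flatnonzero(idx):
    a[i].append(new_items[int(i)])

print(a.tolist())  # [[], ['x'], [], ['y']]
```

If the slots hold arrays rather than lists, the loop body would instead need to build a new array and store it back with a[i] = ..., since arrays cannot grow in place.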
