I noticed the following unittest.TestCase assertion failing and am wondering how to correctly compare empty recarrays:

fails:

self.assertEqual(
    np.array(
        [],
        dtype=[
            ('time', 'datetime64[ns]'),
            ('end_time', int)
        ]
    ).view(np.recarray),
    np.array(
        [],
        dtype=[
            ('time', 'datetime64[ns]'),
            ('end_time', int)
        ]
    ).view(np.recarray)
)

passes:

self.assertEqual(
    np.array(
        [(1,1)],
        dtype=[
            ('time', 'datetime64[ns]'),
            ('end_time', int)
        ]
    ).view(np.recarray),
    np.array(
        [(1,1)],
        dtype=[
            ('time', 'datetime64[ns]'),
            ('end_time', int)
        ]
    ).view(np.recarray)
)

Is this a bug or am I doing something wrong here?

2 Answers

unittest.TestCase.assertEqual compares its arguments with ==, i.e. the __eq__ method, and fails if the result is falsy. For numpy.ndarray objects, == does elementwise equality and returns a boolean array rather than a single bool. Thus, using == on two empty arrays returns an empty boolean array, which is falsy:

>>> arr1
rec.array([],
          dtype=[('time', '<M8[ns]'), ('end_time', '<i8')])
>>> arr2
rec.array([],
          dtype=[('time', '<M8[ns]'), ('end_time', '<i8')])
>>> bool(arr1 == arr2)
False
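A minimal sketch of what the comparison actually produces, and of one way to get a single unambiguous answer instead. The names DTYPE and elementwise are illustrative only. The sketch deliberately avoids calling bool() on the empty result, since how NumPy treats that depends on the version.

```python
import numpy as np

# Same record layout as in the question.
DTYPE = [("time", "datetime64[ns]"), ("end_time", int)]

arr1 = np.array([], dtype=DTYPE).view(np.recarray)
arr2 = np.array([], dtype=DTYPE).view(np.recarray)

# == is elementwise, so the result is an array with no elements,
# not a single True/False.
elementwise = arr1 == arr2
print(elementwise.shape)
print(elementwise.size)

# Reducing with all() gives one answer; all() over no elements is True.
print(elementwise.all())

# np.array_equal checks the shapes first and then reduces the comparison.
print(np.array_equal(arr1, arr2))
```

Reducing explicitly states what "equal" should mean for the test, instead of relying on the truth value of an array.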

Your second case is another special case: elementwise equality on two record arrays with a single element each produces a boolean array of shape (1,). For an array with a single item, the truthiness is simply the truthiness of that item:

>>> bool(np.array([1]))
True
>>> bool(np.array([0]))
False
>>> bool(np.array([{}]))
False
>>> bool(np.array([{'a':1}]))
True
>>> bool(np.array([object()]))
True

So, with your arrays:

>>> arr3 = np.array(
...         [(1,1)],
...         dtype=[
...             ('time', 'datetime64[ns]'),
...             ('end_time', int)
...         ]
...     ).view(np.recarray)
>>> arr4 = np.array(
...         [(1,1)],
...         dtype=[
...             ('time', 'datetime64[ns]'),
...             ('end_time', int)
...         ]
...     ).view(np.recarray)
>>> arr3.size, arr4.size
(1, 1)
>>> arr3 == arr4
rec.array([ True],
          dtype=bool)
>>> bool(arr3 == arr4)
True

Note that whenever the resulting array has a .size greater than 1, evaluating its truth value raises this infamous error:

>>> np.array([1, 1]) == np.array([1, 1])
array([ True,  True], dtype=bool)
>>> bool(np.array([1, 1]) == np.array([1, 1]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
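As the error message suggests, the comparison has to be reduced explicitly. The following is a small sketch of both the failure and two common ways around it; the variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 1])
b = np.array([1, 1])

# Evaluating the truth value of a multi-element array is ambiguous.
ambiguous = False
try:
    bool(a == b)
except ValueError as exc:
    ambiguous = True
    print(exc)

# Reduce the elementwise result to a single value instead.
print((a == b).all())

# Or let NumPy compare shape and contents in one call.
print(np.array_equal(a, b))
```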

2 Comments

That seems to be it! The next test case I started on gave exactly that error and needed a call to tolist()... Although it looks like a better way to go is the numpy.testing module, now that @Grr mentioned it. Thanks!
"Thus, using == on two empty arrays returns an empty boolean array, which is falsy:" - as of 1.14, trying to convert an empty array to a bool gives a warning

@juanpa.arrivillaga is correct. In addition, note that NumPy arrays are best tested with the numpy.testing module. For example:

np.testing.assert_equal(
    np.array(
        [],
        dtype=[
            ('time', 'datetime64[ns]'),
            ('end_time', int)
        ]
    ).view(np.recarray),
    np.array(
        [],
        dtype=[
            ('time', 'datetime64[ns]'),
            ('end_time', int)
        ]
    ).view(np.recarray)
)
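For context, here is a hedged sketch of how that call might sit inside a unittest.TestCase. The helper make_records and the class name RecarrayComparisonTest are made up for illustration. The second test relies on np.testing.assert_equal raising AssertionError when the shapes differ.

```python
import unittest

import numpy as np

# Same record layout as in the question.
DTYPE = [("time", "datetime64[ns]"), ("end_time", int)]


def make_records(rows):
    """Hypothetical helper: build a recarray with the question's dtype."""
    return np.array(rows, dtype=DTYPE).view(np.recarray)


class RecarrayComparisonTest(unittest.TestCase):
    def test_empty_recarrays_are_equal(self):
        # Passes: no truth value of an empty array is involved.
        np.testing.assert_equal(make_records([]), make_records([]))

    def test_empty_and_nonempty_differ(self):
        # Shapes (0,) and (1,) do not match, so an AssertionError is raised.
        with self.assertRaises(AssertionError):
            np.testing.assert_equal(make_records([]), make_records([(1, 1)]))


# Run the tests programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RecarrayComparisonTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real test module, the last two lines would normally be replaced by the usual test runner invocation.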

1 Comment

Nice! Heck, I'd use it just for the snake_case over... :shudders: camelCase...
