Check if 2d array exists in 3d array in Python?

Question

I have a 3D array with shape (1000, 12, 30) and a list of 2D arrays of shape (12, 30). I want to check whether these 2D arrays exist in the 3D array. Is there a simple way to do this in Python? I tried the in keyword, but it doesn't work.

The solution here should apply to your problem stackoverflow.com/questions/7100242/…. Marking duplicate
– Xero Smith, Commented May 3, 2018 at 3:02
The solutions apply to this case. Adjust the rolling window accordingly
– Xero Smith, Commented May 3, 2018 at 3:05
It doesn't apply. Questions that are duplicates should be marked as duplicates, but these are not the same question. I don't understand why you would mark this as a duplicate.
– Teodorico Levoff, Commented May 3, 2018 at 3:06

Rémy Hosseinkhan Boucher · Accepted Answer · 2020-02-06 22:32:24Z

5

There is a way to do this in numpy, using np.all:

import numpy as np

a = np.random.rand(3, 1, 2)
b = a[1][0]
np.all(np.all(a == b, 1), 1)
Out[612]: array([False,  True, False])

Solution from bnaecker

np.all(a == b, axis=(1, 2))

If you only want to check whether it exists or not:

np.any(np.all(a == b, axis=(1, 2)))
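For the shapes in the question, the same idea might be wrapped up as in the sketch below. The names `stack`, `candidates` and `contains` are made up for illustration, and the data is random filler rather than anything from the original post. Each call returns a single boolean that is True only when the complete (12, 30) array matches some `stack[i]`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the (1000, 12, 30) array; values lie in 0..99.
stack = rng.integers(0, 100, size=(1000, 12, 30))

# Hypothetical list of 2D arrays: one copied from the stack, one that cannot
# occur in it because -1 is outside the generated value range.
candidates = [stack[5].copy(), np.full((12, 30), -1)]


def contains(stack, candidate):
    """Return True if `candidate` equals some stack[i] element for element."""
    # stack == candidate broadcasts to (1000, 12, 30); reducing over the last
    # two axes leaves one boolean per 2D slice, and any() collapses those.
    return bool(np.all(stack == candidate, axis=(1, 2)).any())


results = [contains(stack, c) for c in candidates]
print(results)  # [True, False]
```

The comparison allocates a temporary boolean array the size of `stack` for every candidate, which is fine at this size but worth keeping in mind for much larger inputs.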

edited Feb 6, 2020 at 22:32 by Rémy Hosseinkhan Boucher

answered May 3, 2018 at 3:05 by BENY

7 Comments

bnaecker Over a year ago

Or better yet, np.all(a == b, axis=(1,2)).

Teodorico Levoff Over a year ago

@Wen I see! thanks for this. I'm still not sure how this would work if the depth is 30 like what I mentioned in the question?

bnaecker Over a year ago

@TeodoricoLevoff Check out NumPy's broadcasting rules. b in this case will be broadcast (replicated) along the first dimension to match a. Then the axis arguments to np.all reduce that along the last two dimensions, leaving a boolean array of shape (30,) with True at indices i where a[i] == b.

bnaecker Over a year ago

@TeodoricoLevoff Also note, that you might need to use np.allclose() rather than np.all() if you're dealing with floating point numbers.

Teodorico Levoff Over a year ago

@bnaecker I understand. But I want to return True only if the complete (12, 30) array exist in the (1000, 12, 30). I think the solution mentioned above checks each single value in the 30 lists and outputs a boolean for each?
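A note on the floating-point suggestion in the comments: np.allclose reduces over the whole broadcast comparison and returns a single boolean, so for a per-block test the elementwise np.isclose is probably the more direct fit. The sketch below is an assumption about how that could look, with made-up data and the default tolerances of np.isclose.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.random((1000, 12, 30))   # hypothetical float data in [0, 1)
b = a[7] + 1e-12                 # the same block with tiny numerical noise

# Exact equality fails because every element differs by 1e-12.
exact = bool(np.all(a == b, axis=(1, 2)).any())

# isclose compares elementwise within a tolerance (defaults: rtol=1e-05,
# atol=1e-08); reducing over axes 1 and 2 gives one boolean per 2D block.
close = bool(np.all(np.isclose(a, b), axis=(1, 2)).any())

print(exact, close)  # False True
```

Whether the default tolerances are appropriate depends on the scale of the data, so rtol and atol may need adjusting.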


Paul Panzer · Accepted Answer · 2018-05-04 06:29:25Z

3

Here is a fast method (previously used by @DanielF as well as @jaime and others, no doubt) that uses a trick to benefit from short-circuiting: view-cast template-sized blocks to single elements of dtype void. When comparing two such blocks numpy stops after the first difference, yielding a huge speed advantage.

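>>> import numpy as np
>>>
>>> def in_(data, template):
...     # View each template-sized block of data as one element of dtype void.
...     dv = data.reshape(data.shape[0], -1).view(f'V{data.dtype.itemsize*np.prod(data.shape[1:])}').ravel()
...     # View the whole template as a single 0-d void element of the same width.
...     tv = template.ravel().view(f'V{template.dtype.itemsize*template.size}').reshape(())
...     return (dv==tv).any()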

Example:

>>> a = np.random.randint(0, 100, (1000, 12, 30))
>>> check = a[np.random.randint(0, 1000, (10,))]
>>> check += np.random.random(check.shape) < 0.001    
>>>
>>> [in_(a, c) for c in check]
[True, True, True, False, False, True, True, True, True, False]
# compare to other method
>>> (a==check[:, None]).all((-1,-2)).any(-1)
array([ True,  True,  True, False, False,  True,  True,  True,  True,
       False])

This gives the same result as the "direct" numpy approach, but is almost 20x faster:

>>> from timeit import timeit
>>> kwds = dict(globals=globals(), number=100)
>>> 
>>> timeit("(a==check[:, None]).all((-1,-2)).any(-1)", **kwds)
0.4793281531892717
>>> timeit("[in_(a, c) for c in check]", **kwds)
0.026218891143798828
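When many templates have to be tested against the same data, a different technique from the void view above is to hash the raw bytes of each block once and then do set lookups. This is only a sketch of that alternative, not part of the answer's method. The names `seen`, `templates` and `found` are illustrative, and it assumes the templates have the same dtype and shape as the blocks. Byte comparison also treats NaN and signed zeros differently from ==.

```python
import numpy as np

a = np.arange(24).reshape(4, 2, 3)   # small hypothetical stack of 2D blocks
templates = [
    a[2].copy(),                      # present in the stack
    np.zeros((2, 3), dtype=a.dtype),  # absent: no block of `a` is all zeros
]

# tobytes() returns the block's contents in C order, so equal blocks of the
# same dtype and shape produce equal byte strings.
seen = {block.tobytes() for block in a}

found = [t.tobytes() in seen for t in templates]
print(found)  # [True, False]
```

Building the set costs one pass over the data, after which each lookup is a hash of the template's bytes. No timing claim is made here relative to the void-view approach.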

edited May 4, 2018 at 6:29

answered May 3, 2018 at 3:45 by Paul Panzer

4 Comments

Daniel F Over a year ago

I was hoping someone who was better at actual coding would eventually improve my old vview code. Once you have the void view, couldn't you just use np.in1d though?

Paul Panzer Over a year ago

@DanielF You are right, that should be even faster. Could you give me a pointer to your post so I can properly credit you?

Paul Panzer Over a year ago

@DanielF Strange, I tried with in1d or rather the new isin and it is 10x slower. Not sure what's going on here.

Daniel F Over a year ago

I've given answers with it a few times: here and here most recently. But the original idea came from @jaime here

piRSquared · Accepted Answer · 2018-05-03 03:35:18Z

Numpy

Given

import numpy as np

a = np.arange(12).reshape(3, 2, 2)
lst = [
    np.arange(4).reshape(2, 2),
    np.arange(4, 8).reshape(2, 2)
]

print(a, *lst, sep='\n{}\n'.format('-' * 20))

[[[ 0  1]
  [ 2  3]]

 [[ 4  5]
  [ 6  7]]

 [[ 8  9]
  [10 11]]]
--------------------
[[0 1]
 [2 3]]
--------------------
[[4 5]
 [6 7]]

Notice that lst is a list of arrays as per OP. I'll make that a 3d array b below.

Use broadcasting. Following the broadcasting rules, I want a to have shape (1, 3, 2, 2) and b to have shape (2, 1, 2, 2), so that every sub-array of b is compared with every sub-array of a.

b = np.array(lst)
x, *y = b.shape
c = np.equal(
    a.reshape(1, *a.shape),
    b.reshape(x, 1, *y)
)

I'll use all to produce a (2, 3) array of truth values and np.where to find out which among the a and b sub-arrays are actually equal.

i, j = np.where(c.all((-2, -1)))

This is just a verification that we achieved what we were after. We are supposed to observe that for each paired i and j values, the sub-arrays are actually the same.

# i indexes b (axis 0 of c) and j indexes a (axis 1 of c)
for bi, aj in zip(i, j):
    print(a[aj], b[bi], sep='\n\n')
    print('------')

[[0 1]
 [2 3]]

[[0 1]
 [2 3]]
------
[[4 5]
 [6 7]]

[[4 5]
 [6 7]]
------
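If only a yes/no answer per list item is needed, the truth table can be collapsed along the axis that runs over the sub-arrays of a. The sketch below reuses the example data and adds a third, hypothetical list entry that does not occur in a. Indexing with None is used here as an equivalent of the reshape calls.

```python
import numpy as np

a = np.arange(12).reshape(3, 2, 2)
lst = [
    np.arange(4).reshape(2, 2),
    np.arange(4, 8).reshape(2, 2),
    np.full((2, 2), -1),            # hypothetical extra entry, not in a
]
b = np.stack(lst)                   # shape (3, 2, 2)

# a[None] has shape (1, 3, 2, 2) and b[:, None] has shape (3, 1, 2, 2), so the
# comparison broadcasts to (len(lst), len(a), 2, 2).
table = (a[None] == b[:, None]).all(axis=(-2, -1))  # shape (len(lst), len(a))

# One flag per list item: does it match any sub-array of a?
present = table.any(axis=1)
print(present)  # [ True  True False]
```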

`in`

However, to complete OP's thought on using in

a_ = a.tolist()
list(filter(lambda x: x.tolist() in a_, lst))

[array([[0, 1],
        [2, 3]]), array([[4, 5],
        [6, 7]])]
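As for why in does not work directly on the arrays: for a NumPy array, x in a behaves like (a == x).any(), so it reports True as soon as any single element matches after broadcasting. On a plain list of arrays, in needs the truth value of an array comparison and raises ValueError. The sketch below illustrates both cases, where `not_present` is a made-up array that shares only one element with a.

```python
import numpy as np

a = np.arange(12).reshape(3, 2, 2)
not_present = np.array([[0, 99], [99, 99]])   # only the 0 coincides with a

# ndarray membership is elementwise: one matching element is enough.
loose = not_present in a

# Requiring every element of a 2D slice to match gives the intended answer.
exact = bool((a == not_present).all(axis=(1, 2)).any())

print(loose, exact)  # True False

# With a Python list of arrays, `in` compares with == and then needs a single
# truth value, which is ambiguous for a multi-element array.
try:
    not_present in [a[0], a[1]]
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```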
