Check if each element in a numpy array is in another array

Question

This problem seems easy but I cannot quite get a nice-looking solution. I have two numpy arrays (A and B), and I want to get the indices of A where the elements of A are in B and also get the indices of A where the elements are not in B.

So, if

A = np.array([1,2,3,4,5,6,7])
B = np.array([2,4,6])

Currently I am using

C = np.searchsorted(A,B)

which takes advantage of the fact that A is sorted and gives me [1, 3, 5], the indices of the elements of A that are in B. This is great, but how do I get D = [0, 2, 4, 6], the indices of the elements of A that are not in B?

ford · Accepted Answer · 2014-02-11 21:09:50Z

44

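searchsorted may give you the wrong answer if not every element of B is in A. You can use numpy.in1d instead (in NumPy 2.0 and later, np.in1d is deprecated in favor of np.isin, which takes the same arguments here):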

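import numpy as np

A = np.array([1,2,3,4,5,6,7])
B = np.array([2,4,6,8])
mask = np.in1d(A, B)
print(np.where(mask)[0])   # indices of A whose elements are in B
print(np.where(~mask)[0])  # indices of A whose elements are not in B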

The output is:

[1 3 5]
[0 2 4 6]
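To make the searchsorted caveat concrete, here is a minimal sketch using the same A and B as above. searchsorted returns insertion points rather than membership results, so a value of B that is missing from A still gets an index. The names in_bounds and found are illustrative and not part of the original answer.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6, 7])
B = np.array([2, 4, 6, 8])  # 8 does not occur in A

# searchsorted returns insertion points, not membership results
C = np.searchsorted(A, B)
print(C)  # [1 3 5 7]; index 7 is past the end of A

# Keep only the positions that really hold the searched value
in_bounds = C < len(A)
found = np.zeros(len(B), dtype=bool)
found[in_bounds] = A[C[in_bounds]] == B[in_bounds]
print(C[found])  # [1 3 5]
```

A membership test such as in1d needs no such post-filtering, because it never produces an index for a value that is absent.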

However, in1d() relies on sorting, which can be slow for large datasets. If your dataset is large, you can use pandas instead:

import pandas as pd
np.where(pd.Index(pd.unique(B)).get_indexer(A) >= 0)[0]

Here is the time comparison:

A = np.random.randint(0, 1000, 10000)
B = np.random.randint(0, 1000, 10000)

%timeit np.where(np.in1d(A, B))[0]
%timeit np.where(pd.Index(pd.unique(B)).get_indexer(A) >= 0)[0]

Output:

100 loops, best of 3: 2.09 ms per loop
1000 loops, best of 3: 594 µs per loop

edited Feb 11, 2014 at 21:09

ford

answered Apr 11, 2013 at 3:51

HYRY


1 Comment

DanHickstein Over a year ago

It's good to know about this efficient method because my datasets are very large. Thanks so much for this solution!

askewchan · Accepted Answer · 2013-04-11 02:40:26Z

8

import numpy as np

A = np.array([1,2,3,4,5,6,7])
B = np.array([2,4,6])
C = np.searchsorted(A, B)

# np.alen was removed from recent NumPy releases; len(A) is equivalent for a 1-D array
D = np.delete(np.arange(len(A)), C)

D
#array([0, 2, 4, 6])
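An equivalent way to turn the already computed C into D is a boolean mask, sketched below. This is an alternative under the same assumption as the answer (A is sorted and every element of B occurs in A); the name keep is illustrative.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6, 7])
B = np.array([2, 4, 6])
C = np.searchsorted(A, B)  # valid here because every element of B is in A

# Start with every index marked, then unmark the positions found in C
keep = np.ones(len(A), dtype=bool)
keep[C] = False
D = np.flatnonzero(keep)
print(D)  # [0 2 4 6]
```

The mask form also gives direct access to the elements themselves through A[keep] and A[~keep].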

answered Apr 11, 2013 at 2:40

askewchan

2 Comments

DanHickstein Over a year ago

Thanks! I also like the answer provided by alexhb using np.setdiff1d. I was hoping that there was a function that would give me the indices directly, but this works just fine.

askewchan Over a year ago

There might be, @Dan, but I can't think of it. If you don't need C, use his solution, but mine will be twice as fast if you've already got C.

alexhb · Accepted Answer · 2013-04-11 02:48:04Z

7

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])
b = np.array([2, 4, 6])
c = np.searchsorted(a, b)
d = np.searchsorted(a, np.setdiff1d(a, b))

d
#array([0, 2, 4, 6])
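Both searchsorted calls above depend on a being sorted. As a hedged sketch of what to do when that is not guaranteed, the example below uses np.isin on a deliberately unsorted array (the values are made up for illustration) and derives both index sets from one mask.

```python
import numpy as np

a = np.array([7, 2, 5, 4, 1])  # deliberately unsorted
b = np.array([2, 4, 6])

mask = np.isin(a, b)          # True where an element of a occurs in b
c = np.flatnonzero(mask)      # indices of a whose elements are in b
d = np.flatnonzero(~mask)     # indices of a whose elements are not in b
print(c)  # [1 3]
print(d)  # [0 2 4]
```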

answered Apr 11, 2013 at 2:48

alexhb

2 Comments

askewchan Over a year ago

Having to search twice slows this down a bit; it is better to use the already known C to get D. But this is of course the better solution if C is not needed, so +1. (Welcome to Stack Overflow!)

Ricardo Decal Over a year ago

Should the c line be deleted? It is not doing anything here.

Community · Accepted Answer · 2020-06-20 09:12:55Z

6

The elements of A that are also in B:

set(A) & set(B)

The elements of A that are not in B:

set(A) - set(B)

edited Jun 20, 2020 at 9:12

CommunityBot

answered May 11, 2017 at 15:22

Ben Zweig

2 Comments

Nerxis Over a year ago

This does not answer the question (which asks for indices, not elements). However, if you want to perform the above operations on NumPy arrays, do not convert them to sets; use NumPy operations instead. See intersect1d and setdiff1d (or possibly setxor1d).
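The NumPy functions named in this comment can be sketched as follows, using the arrays from the question. The return_indices option of intersect1d also recovers the positions in A, which is closer to what the question asks for; the variable names are illustrative.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6, 7])
B = np.array([2, 4, 6])

# Elements common to both arrays, plus their positions in A and in B
common, idx_A, idx_B = np.intersect1d(A, B, return_indices=True)
# Elements of A that do not occur in B
only_A = np.setdiff1d(A, B)

print(common)  # [2 4 6]
print(idx_A)   # [1 3 5]
print(only_A)  # [1 3 5 7]
```

Note that both functions return sorted, unique values, so duplicates in A are collapsed.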

PhasorLaser Over a year ago

Thank you, as I was looking for elements not indices and the question title is ambiguous. I appreciate the numpy operations as well.

Ricardo Decal · Accepted Answer · 2023-03-20 16:52:48Z

0

import numpy as np

all_vals = np.arange(1000)  # `A` in the question
seen_vals = np.unique(np.random.randint(0, 1000, 100))  # `B` in the question
# boolean mask that is True where a value of `all_vals` is not in `seen_vals`
mask = np.isin(all_vals, seen_vals, invert=True)
unseen_idx = np.flatnonzero(mask)  # indices of unseen values, `D` in the question
unseen_vals = all_vals[mask]

answered Mar 20, 2023 at 16:52

Ricardo Decal
