You could use np.in1d to find those elements of a which are in b.
To find the index, use one call to np.where:
In [34]: a = array([1,2,3,4,5])
In [35]: b = array([2,4,7])
In [36]: np.in1d(a, b)
Out[38]: array([False, True, False, True, False], dtype=bool)
In [39]: np.where(np.in1d(a, b))
Out[39]: (array([1, 3]),)
Because a and b are already sorted, you could use
In [57]: np.searchsorted(b, a, side='right') != np.searchsorted(b, a, side='left')
Out[57]: array([False, True, False, True, False], dtype=bool)
instead of np.in1d(a, b). For large a and b, using searchsorted may be faster:
import numpy as np
a = np.random.choice(10**7, size=10**6, replace=False)
a.sort()
b = np.random.choice(10**7, size=10**5, replace=False)
b.sort()
In [53]: %timeit np.in1d(a, b)
10 loops, best of 3: 176 ms per loop
In [54]: %timeit np.searchsorted(b, a, side='right') != np.searchsorted(b, a, side='left')
10 loops, best of 3: 106 ms per loop
Jaime and Divakar have suggested some significant improvements on the method shown above. Here is some code which tests that the methods all return the same result, followed by some benchmarks:
import numpy as np
a = np.random.choice(10**7, size=10**6, replace=False)
a.sort()
b = np.random.choice(10**7, size=10**5, replace=False)
b.sort()
def using_searchsorted(a, b):
return (np.where(np.searchsorted(b, a, side='right')
!= np.searchsorted(b, a, side='left')))[0]
def using_in1d(a, b):
return np.where(np.in1d(a, b))[0]
def using_searchsorted_divakar(a, b):
idx1 = np.searchsorted(a,b,'left')
idx2 = np.searchsorted(a,b,'right')
out = idx1[idx1 != idx2]
return out
def using_jaime_mask(haystack, needle):
idx = np.searchsorted(haystack, needle)
mask = idx < haystack.size
mask[mask] = haystack[idx[mask]] == needle[mask]
idx = idx[mask]
return idx
expected = using_searchsorted(a, b)
for func in (using_in1d, using_searchsorted_divakar, using_jaime_mask):
result = func(a, b)
assert np.allclose(expected, result)
In [29]: %timeit using_jaime_mask(a, b)
100 loops, best of 3: 13 ms per loop
In [28]: %timeit using_searchsorted_divakar(a, b)
10 loops, best of 3: 21.7 ms per loop
In [26]: %timeit using_searchsorted(a, b)
10 loops, best of 3: 109 ms per loop
In [27]: %timeit using_in1d(a, b)
10 loops, best of 3: 173 ms per loop