Python: finding index of all elements of array in another including repeating arrays

Question

I have an array A of size 100 that may contain repeated elements. I have another array B of size 10 whose elements are unique. Every element of B is present in A and vice versa. I have a third array C, aligned with B, so that each element of C corresponds to the element of B at the same index.

I want to create an array A2 composed of elements of C, such that I can achieve the following:

import numpy as np
A = np.array([1,1,4,5,5,6])
B = np.array([4,6,5,1])
C = np.array(['A','B','C','D'])

I want to create A2 such that:

A2 = np.array(['D','D','A','C','C','B'])

That is, each element of A is looked up in B, and A2 takes the element of C at the matching index.

Does it have to be numpy? Plain old Python seems more than enough to handle this.
– tobias_k, Commented Jun 10, 2016 at 14:36

tobias_k · Accepted Answer · 2016-06-10 14:39:49Z

1

No need for numpy. Just zip the B and C arrays to a dict and map the values of A:

>>> btoc = dict(zip(B, C))
>>> # list() is needed on Python 3, where map() returns a lazy iterator
>>> A2 = np.array(list(map(btoc.get, A)))
>>> A2
array(['D', 'D', 'A', 'C', 'C', 'B'], dtype='|S1')
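
On Python 3 the same idea can also be written with a list comprehension. This is a minimal sketch, assuming every element of A appears in B (otherwise `btoc[a]` raises a KeyError); the names `btoc` and `A2` follow the snippet above, and the `.tolist()` calls are an addition that converts NumPy scalars to plain Python values before they are used as dict keys:

```python
import numpy as np

A = np.array([1, 1, 4, 5, 5, 6])
B = np.array([4, 6, 5, 1])
C = np.array(['A', 'B', 'C', 'D'])

# Map each value of B to the element of C at the same index.
btoc = dict(zip(B.tolist(), C.tolist()))

# Look up every element of A; a value missing from B raises KeyError.
A2 = np.array([btoc[a] for a in A.tolist()])
print(A2)
```

On Python 3 the resulting array has a Unicode string dtype rather than the byte-string dtype shown in the Python 2 output above.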

Divakar · Accepted Answer · 2016-06-10 17:03:41Z

1

Here's a NumPythonic approach using np.searchsorted -

sidx = B.argsort()
out = C[sidx[np.searchsorted(B,A,sorter = sidx)]]

Sample run -

In [17]: A = np.array([1,1,4,5,5,6])
    ...: B = np.array([4,6,5,1])
    ...: C = np.array(['A','B','C','D'])
    ...: 

In [18]: sidx = B.argsort()

In [19]: C[sidx[np.searchsorted(B,A,sorter = sidx)]]
Out[19]: 
array(['D', 'D', 'A', 'C', 'C', 'B'], 
      dtype='|S1')
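
To make the one-liner easier to follow, the intermediate arrays can be spelled out. This is a sketch on the same sample data; the names `pos` and `idx` and the final sanity check are additions for illustration, not part of the original snippet:

```python
import numpy as np

A = np.array([1, 1, 4, 5, 5, 6])
B = np.array([4, 6, 5, 1])
C = np.array(['A', 'B', 'C', 'D'])

# Indices that would sort B: B[sidx] is [1, 4, 5, 6], so sidx is [3, 0, 2, 1].
sidx = B.argsort()

# Position of each A value within the sorted view of B: [0, 0, 1, 2, 2, 3].
pos = np.searchsorted(B, A, sorter=sidx)

# Translate sorted positions back to indices into the original B: [3, 3, 0, 2, 2, 1].
idx = sidx[pos]

# Sanity check (an addition): every element of A was actually found in B.
assert np.array_equal(B[idx], A)

out = C[idx]
print(out)
```

The check matters because np.searchsorted returns insertion points, not exact matches: a value of A that is absent from B would silently map to a neighbouring element, and a value larger than everything in B would make `sidx[pos]` raise an IndexError.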

Eelco Hoogendoorn · Accepted Answer · 2016-06-10 17:38:08Z

0

The numpy_indexed package (disclaimer: I am its author) can do this in a single call: npi.indices, a vectorized equivalent of list.index.

import numpy as np
A = np.array([1,1,4,5,5,6])
B = np.array([4,6,5,1])
C = np.array(['A','B','C','D'])

import numpy_indexed as npi
i = npi.indices(B, A)
print(C[i])

Performance should be similar to Divakar's solution, since it operates along the same lines, but it is all wrapped up in a convenient package with tests.
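
To illustrate what "a vectorized equivalent of list.index" means without installing anything, here is a sketch that compares a per-element list.index loop with the argsort/searchsorted lookup from the other answer. It does not call numpy_indexed and makes no claim about how npi.indices is implemented internally; the names `i_loop` and `i_vec` are hypothetical:

```python
import numpy as np

A = np.array([1, 1, 4, 5, 5, 6])
B = np.array([4, 6, 5, 1])
C = np.array(['A', 'B', 'C', 'D'])

# Pure-Python reference: one list.index call per element of A.
b_list = B.tolist()
i_loop = np.array([b_list.index(a) for a in A.tolist()])

# Vectorized counterpart built from argsort + searchsorted.
sidx = B.argsort()
i_vec = sidx[np.searchsorted(B, A, sorter=sidx)]

# Both give the index in B of each element of A.
assert np.array_equal(i_loop, i_vec)
print(C[i_loop])
```

The loop scans B once per element of A, while the vectorized version sorts B once and then does a binary search per element, which is where the performance difference comes from on larger arrays.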

2 Comments

Zanam Over a year ago

pip install numpy_indexed gives error: no module named yaml

Eelco Hoogendoorn Over a year ago

Thanks for the feedback, but that's strange; I thought I'd fixed that already. Running pip install pyyaml first should fix it, though.
