Find indices of a list of values in an unsorted NumPy array

Question

I'm referring to a similar question: Find indices of a list of values in a numpy array

In that question the master array is sorted, and we want the index in master of each element of a second array:

master = np.array([1,2,3,4,5])
search = np.array([4,2,2,3])

The suggested solution was:

>>> master = np.array([1,2,3,4,5])
>>> search = np.array([4,2,2,3])
>>> np.searchsorted(master, search)
array([3, 1, 1, 2])

But what if master is not sorted? For example, if I have two arrays like this, where the first one is not sorted:

>>> master = np.array([2,3,5,4,1])
>>> search = np.array([3,2,1,4,5])

I get:

>>> np.searchsorted(master, search)
array([1, 0, 0, 2, 5])

But instead I would like:

array([1,0,4,3,2])

i.e. the indices in master of the items in search.

How do I get them, preferably with a native NumPy function (not using [np.where(master==i) for i in search])?

Thanks

EDIT: In this case the search array is a permutation of master, so I would like to find how the indices of master are permuted to give an array like search.

In the general case, the search array contains some items that may or may not be in master, such as:

>>> master = np.array([2,3,5,4,1])
>>> search = np.array([1,4,7])

Is this an XY question? Are you just trying to find a permutation of a given array? Because that can be easily done.
– Andras Deak -- Слава Україні, Commented Oct 13, 2016 at 13:09
So do you want to avoid sorting then? The results are not what you expect because the algorithm behind searchsorted assumes the input to be sorted (like in binary search).
– rubik, Commented Oct 13, 2016 at 13:11
In my specific case, search is a permutation of master (so I want to find the index permutation of master that results in the search array).
– claudius_dev, Commented Oct 13, 2016 at 13:17

Andras Deak -- Слава Україні · Accepted Answer · 2020-09-29 15:30:54Z

Disclaimer: I wrote this answer for an earlier revision of the question. If you want to solve the more general problem added in the question's edit (where we aren't just looking for a permutation of an array), see Will's answer.

If all else fails, you can sort your master array temporarily, match the elements against the sorted version, and then undo the sort to get indices into the original array:

import numpy as np

master = np.array([2,3,5,4,1])
search = np.array([3,2,1,4,5])

# sorting permutation and its inverse
sorti = np.argsort(master)
sorti_inv = np.empty(sorti.shape,dtype=np.int64)
sorti_inv[sorti] = np.arange(sorti.size)

# get indices in sorted version
tmpind = np.searchsorted(master,search,sorter=sorti)

# transform indices back to original array with inverse permutation
final_inds = tmpind[sorti_inv]

For this example, the result of the above is the expected

array([1, 0, 4, 3, 2])
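Whether this carries over to other inputs can be checked against the criterion stated in the comments on this answer, namely that np.array_equal(master[final_inds], search) should be True. The following is a minimal sketch of such a check, using a hypothetical three-element example that is not from the original post. The names via_inverse and via_sorter are illustrative only; via_sorter uses the sorti[tmpind] indexing from Will's answer.

```python
import numpy as np

# hypothetical example, not from the original post
master = np.array([3, 1, 2])
search = np.array([1, 2, 3])

# sorting permutation of master and its inverse
sorti = np.argsort(master)                          # [1, 2, 0]
sorti_inv = np.empty(sorti.shape, dtype=np.int64)
sorti_inv[sorti] = np.arange(sorti.size)            # [2, 0, 1]

# positions of the search values within the sorted master
tmpind = np.searchsorted(master, search, sorter=sorti)   # [0, 1, 2]

via_inverse = tmpind[sorti_inv]   # indexing used in the snippet above
via_sorter = sorti[tmpind]        # maps sorted positions back to master

print(np.array_equal(master[via_inverse], search))  # False
print(np.array_equal(master[via_sorter], search))   # True
```

On the question's own master and search, both expressions give array([1, 0, 4, 3, 2]), so the difference only shows up on other inputs.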

As you noted in a comment, your specific search and master are permutations of each other. In this case you can alternatively sort both arrays, and use the inverse permutation combined with the other direct permutation:

sorti = np.argsort(master)
sorti_inv = np.empty(sorti.shape,dtype=np.int64)
sorti_inv[sorti] = np.arange(sorti.size)
sorti_s = np.argsort(search)
final_inds = sorti_s[sorti_inv]

One should consider the effort needed to sort two arrays vs. sorting one array and searching in its sorted version. I really can't tell which one's faster.
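One way to settle the speed question on a given machine is to time both strategies. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the inputs are random permutations with unique values, and both functions map sorted positions back through sorti (as in Will's answer) so that master[result] equals search, rather than reusing the exact expressions above. No timings are quoted here because they depend on array size and hardware.

```python
import timeit
import numpy as np

def via_searchsorted(master, search):
    # sort master once, then binary-search every value of search in it
    sorti = np.argsort(master)
    return sorti[np.searchsorted(master, search, sorter=sorti)]

def via_two_argsorts(master, search):
    # sort both arrays; rank[i] is the position of search[i] in sorted order
    # (assumes search is a permutation of master with unique values)
    sorti = np.argsort(master)
    sorti_s = np.argsort(search)
    rank = np.empty_like(sorti_s)
    rank[sorti_s] = np.arange(sorti_s.size)
    return sorti[rank]

rng = np.random.default_rng(0)
master = rng.permutation(100_000)
search = rng.permutation(master)

for func in (via_searchsorted, via_two_argsorts):
    result = func(master, search)
    # both strategies must reproduce search when used to index master
    assert np.array_equal(master[result], search)
    seconds = timeit.timeit(lambda: func(master, search), number=10)
    print(f"{func.__name__}: {seconds / 10:.4f} s per call")
```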

edited Sep 29, 2020 at 15:30

answered Oct 13, 2016 at 13:16

3 Comments

Will Over a year ago

The first part of this answer is in general wrong (although it works for the test case provided), assuming the question is asking for final_inds such that np.array_equal(master[final_inds], search) is True.

Andras Deak -- Слава Україні Over a year ago

@Will I'll take a closer look when I get a chance later, thanks. Judging by the timestamps it's possible I hadn't even seen the edit on the question.

Andras Deak -- Слава Україні Over a year ago

@Will sorry, I forgot to follow up on your comment. You're right, as I suspected I hadn't seen the edit on the question. As a simplest solution I've put a disclaimer on top, pointing readers to your answer if they are looking for the more general problem.

Will · Accepted Answer · 2020-06-17 05:09:35Z

Here is an answer to the original question. (The edit to the question does not specify what should be returned when search is not a subset of master.)

import numpy as np

def get_indices(master, search):
    # every value in search must occur in master
    if not set(search).issubset(set(master)):
        raise ValueError('search must be a subset of master')

    # permutation that sorts master
    sorti = np.argsort(master)

    # get indices in sorted version
    tmpind = np.searchsorted(master, search, sorter=sorti)

    # map positions in the sorted order back to positions in master
    final_inds = sorti[tmpind]

    return final_inds


master = np.array([3, 4, 5, 6, 1, 9, 0, 2, 7, 8])
search = np.array([6, 4, 3, 1, 1])

final_inds = get_indices(master, search)

assert np.array_equal(master[final_inds], search)

The result for final_inds is

array([3, 1, 0, 4, 4])
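For the case from the question's edit, where some values of search may be absent from master, one possible convention is to return a sentinel instead of raising. The sketch below assumes -1 as that sentinel; this choice, the function name get_indices_or_missing and the missing parameter are all hypothetical, since the question does not say what should be returned.

```python
import numpy as np

def get_indices_or_missing(master, search, missing=-1):
    # hypothetical variant: report `missing` for values not found in master
    master = np.asarray(master)
    search = np.asarray(search)
    if master.size == 0:
        return np.full(search.shape, missing, dtype=np.intp)

    # stable sort so that, for repeated values, the first occurrence wins
    sorti = np.argsort(master, kind="stable")

    # insertion positions in the sorted view of master
    pos = np.searchsorted(master, search, sorter=sorti)

    # values larger than every element give pos == master.size; clip them
    pos = np.clip(pos, 0, master.size - 1)

    # candidate index in the original array, kept only on an exact match
    candidates = sorti[pos]
    found = master[candidates] == search
    return np.where(found, candidates, missing)


master = np.array([2, 3, 5, 4, 1])
search = np.array([1, 4, 7])

# 1 is at index 4, 4 is at index 3, and 7 is not in master
print(get_indices_or_missing(master, search))
```

A caller can then filter with result != -1 before indexing master, because -1 would otherwise silently select the last element.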
