6

I'm referring to a similar question: Find indices of a list of values in a numpy array

In that case we have a sorted master array and a second array whose elements we want to locate in the master array.

master = np.array([1,2,3,4,5])
search = np.array([4,2,2,3])

The suggested solution was:

>>> master = np.array([1,2,3,4,5])
>>> search = np.array([4,2,2,3])
>>> np.searchsorted(master, search)
array([3, 1, 1, 2])

But what if master is not sorted? For example, if I have two arrays like this, where the first one is not sorted:

>>> master = np.array([2,3,5,4,1])
>>> search = np.array([3,2,1,4,5])

I get:

>>> np.searchsorted(master, search)
array([1, 0, 0, 2, 5])

But instead I would like:

array([1,0,4,3,2])

i.e. the index in master of each item in search.

How do I get them, preferably with a native NumPy function (that is, not using [np.where(master==i) for i in search])?

Thanks

EDIT: In this case the search array is a permutation of master, so I would like to find how the indices of master are permuted to give an array like search.

In the general case, the search array contains some items that may or may not be present in master, such as:

>>> master = np.array([2,3,5,4,1])
>>> search = np.array([1,4,7])
3 Comments
  • Is this an XY question? Are you just trying to find a permutation of a given array? Because that can be easily done. Commented Oct 13, 2016 at 13:09
  • So do you want to avoid sorting then? The results are not what you expect because the algorithm behind searchsorted assumes the input to be sorted (like in binary search). Commented Oct 13, 2016 at 13:11
  • In my specific case search is a permutation of master (so I want to find the permutation of master's indices that results in the search array). Commented Oct 13, 2016 at 13:17

2 Answers

3

Disclaimer: I wrote this answer for an earlier revision of the question. If you want to solve the problem in the appendix (when we aren't just looking for a permutation of an array), see Will's answer.

If all else fails, you need to sort your master array temporarily, then invert the sort order needed for this after matching the elements:

import numpy as np

master = np.array([2,3,5,4,1])
search = np.array([3,2,1,4,5])

# sorting permutation and its reverse
sorti = np.argsort(master)
sorti_inv = np.empty(sorti.shape,dtype=np.int64)
sorti_inv[sorti] = np.arange(sorti.size)

# get indices in sorted version
tmpind = np.searchsorted(master,search,sorter=sorti)

# transform indices back to original array with inverse permutation
final_inds = tmpind[sorti_inv]

For this input, the result of the above is the expected

array([1, 0, 4, 3, 2])
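
A quick way to check a result like this is the condition master[final_inds] == search. The following is only a sketch under that assumption about the goal, not part of the snippet above: the helper name indices_via_sorter is made up for illustration, and it maps the positions returned by np.searchsorted back through the sort permutation itself (sorti[tmpind]). It assumes every value in search occurs in master.

```python
import numpy as np

def indices_via_sorter(master, search):
    # hypothetical helper; assumes every value in search occurs in master
    sorti = np.argsort(master)
    # positions of the search values within the sorted view of master
    tmpind = np.searchsorted(master, search, sorter=sorti)
    # translate sorted positions back to positions in the original master
    return sorti[tmpind]

master = np.array([2, 3, 5, 4, 1])

search = np.array([3, 2, 1, 4, 5])
inds = indices_via_sorter(master, search)
print(inds.tolist())  # [1, 0, 4, 3, 2]

# a permutation that is not its own inverse
search2 = np.array([1, 4, 5, 2, 3])
inds2 = indices_via_sorter(master, search2)
print(np.array_equal(master[inds2], search2))  # True
```

For search2, indexing tmpind with sorti_inv gives [3, 4, 2, 1, 0] instead of [4, 3, 2, 0, 1], so the two mappings only agree for inputs like the one in the question, where the permutation happens to be its own inverse.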

As you noted in a comment, your specific search and master are permutations of each other. In this case you can alternatively sort both arrays, and use the inverse permutation combined with the other direct permutation:

sorti = np.argsort(master)
sorti_inv = np.empty(sorti.shape,dtype=np.int64)
sorti_inv[sorti] = np.arange(sorti.size)
sorti_s = np.argsort(search)
final_inds = sorti_s[sorti_inv]

One should weigh the effort of sorting two arrays against that of sorting one array and searching it for the elements of the other. I really can't tell which one is faster.
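
To compare the two strategies on your own data, something like the following can be used. It is only a sketch: the helper names by_searchsorted and by_two_argsorts are made up for illustration, both assume that search is a permutation of master with distinct values, and both target master[final_inds] == search, so the second one composes the permutations as sorti indexed by the inverse of sorti_s. No timings are quoted here because they depend on array size and hardware.

```python
import timeit
import numpy as np

def by_searchsorted(master, search):
    # one argsort plus a binary search per element of search
    sorti = np.argsort(master)
    return sorti[np.searchsorted(master, search, sorter=sorti)]

def by_two_argsorts(master, search):
    # two argsorts; assumes search is a permutation of master
    sorti = np.argsort(master)
    sorti_s = np.argsort(search)
    rank_s = np.empty_like(sorti_s)
    rank_s[sorti_s] = np.arange(sorti_s.size)  # inverse of sorti_s
    return sorti[rank_s]

rng = np.random.default_rng(0)
master = rng.permutation(100_000)   # distinct values, unsorted
search = rng.permutation(master)    # shuffled copy of master

a = by_searchsorted(master, search)
b = by_two_argsorts(master, search)
# both approaches should agree and reproduce search from master
assert np.array_equal(a, b) and np.array_equal(master[a], search)

for fn in (by_searchsorted, by_two_argsorts):
    t = timeit.timeit(lambda: fn(master, search), number=10)
    print(fn.__name__, t)
```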


3 Comments

The first part of this answer is in general wrong (although it works for the test case provided) - assuming the question is asking for final_inds such that np.array_equal(master[final_inds], search) is True
@Will I'll take a closer look when I get a chance later, thanks. Judging by the timestamps it's possible I hadn't even seen the edit on the question.
@Will sorry, I forgot to follow up on your comment. You're right, as I suspected I hadn't seen the edit on the question. As a simplest solution I've put a disclaimer on top, pointing readers to your answer if they are looking for the more general problem.
2

Here is an answer to the original question. (The edit to the question does not specify what should be returned when search is not a subset of master.)

import numpy as np

def get_indices(master, search):

    if not set(search).issubset(set(master)):
        raise ValueError('search must be a subset of master')

    sorti = np.argsort(master)

    # get indices in sorted version
    tmpind = np.searchsorted(master,search,sorter=sorti)

    final_inds = sorti[tmpind]

    return final_inds


master = np.array([3, 4, 5, 6, 1, 9, 0, 2, 7, 8])
search = np.array([6, 4, 3, 1, 1])

final_inds = get_indices(master, search)

assert np.array_equal(master[final_inds], search)

The result for final_inds is

array([3, 1, 0, 4, 4])
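
Since the edit leaves the not-found case open, one possible convention (an assumption here, not something the question specifies) is to return -1 for values of search that do not occur in master. The sketch below uses a hypothetical name, get_indices_or_missing, and clips the positions from np.searchsorted before checking for an exact match:

```python
import numpy as np

def get_indices_or_missing(master, search, missing=-1):
    # hypothetical variant: `missing` marks values absent from master
    master = np.asarray(master)
    search = np.asarray(search)
    if master.size == 0:
        return np.full(search.shape, missing, dtype=np.intp)

    sorti = np.argsort(master)
    tmpind = np.searchsorted(master, search, sorter=sorti)

    # searchsorted returns len(master) for values above the maximum,
    # so clip before using the positions as indices
    tmpind = np.clip(tmpind, 0, master.size - 1)
    candidates = sorti[tmpind]

    # keep a candidate only where it is an exact match
    found = master[candidates] == search
    return np.where(found, candidates, missing)

master = np.array([2, 3, 5, 4, 1])
search = np.array([1, 4, 7])
print(get_indices_or_missing(master, search).tolist())  # [4, 3, -1]
```

Callers that would rather have a boolean mask than a sentinel could return found alongside candidates instead.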
