Suppose we have a NumPy array a that needs to be sampled with replacement to create a second NumPy array b:
import numpy as np
a = np.arange(10, 200*1000)
b = np.random.choice(a, len(a), replace=True)
What is the most efficient way to find an array of indexes named mapping that will transform a to b? It is OK to change np.random.choice to a more suitable function.
The following code is too slow: it takes 7-8 seconds on a MacBook Pro to create the mapping array. With an array size of 1 million, it will take much longer.
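mapping = np.array([], dtype=int)  # np.int was removed in NumPy 1.24; the builtin int is equivalent
for n in b:
    m = np.searchsorted(a, n)  # position of n in the sorted array a
    mapping = np.append(mapping, m)  # allocates and copies a new array on every iteration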
np.searchsorted() and np.append() are meant to replace explicit loops, since each operates on a whole array at once. Calling them on every iteration instead is indeed bad for performance: np.append() in particular copies the entire array each time it is called.
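As a minimal sketch of what that could look like, the loop can be dropped in two ways. The first passes all of b to np.searchsorted() in a single call, which relies on a being sorted and containing unique values, as it does in the question. The second takes up the question's offer to replace np.random.choice: it samples the indexes first and builds b from them. The seeded generator rng and the names mapping2 and b2 are assumptions added for illustration; they are not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (assumption, for reproducibility)
a = np.arange(10, 200 * 1000)    # sorted, unique values, as in the question

# Option 1: keep b as sampled values and look up every position in one call.
b = rng.choice(a, len(a), replace=True)
mapping = np.searchsorted(a, b)  # valid because a is sorted and each b value occurs in a

# Option 2: sample the indexes first, then derive b from them by fancy indexing.
mapping2 = rng.integers(0, len(a), size=len(a))
b2 = a[mapping2]
```

In both cases indexing a with the mapping reproduces the sampled array, so a[mapping] equals b and a[mapping2] equals b2. The second option avoids the search entirely and also works when a is unsorted or contains duplicates.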