30

Is it possible to modify the numpy.random.choice function so that it returns the index of the chosen element? Basically, I want to create a list and select elements from it randomly without replacement.

>>> import numpy as np
>>> a = [1,4,1,3,3,2,1,4]
>>> np.random.choice(a)
4
>>> a
[1, 4, 1, 3, 3, 2, 1, 4]

a.remove(np.random.choice(a)) removes the first element of the list that has the chosen value (a[1] in the example above), which may not be the element that was actually chosen (e.g., a[7]).

4
  • 2
    It may not be the chosen element, but it seems like the two cases are indistinguishable. Commented Sep 13, 2013 at 20:07
  • enumerate would probably work Commented Sep 13, 2013 at 20:07
  • @Rob: Not really. After I create the list it's important that it remains in the same order, whichever element I remove. Commented Sep 13, 2013 at 20:08
  • 1
    ... there should be a function np.random.argchoice(...) Commented Nov 7, 2017 at 12:54

9 Answers

19

Regarding your first question, you can work the other way around: randomly choose from the indices of the array a, then fetch the value.

>>> import numpy as np
>>> a = [1,4,1,3,3,2,1,4]
>>> a = np.array(a)
>>> np.random.choice(np.arange(a.size))
6
>>> a[6]

But if you just need a random sample without replacement, replace=False will do. I can't remember when it was first added to random.choice; it might have been 1.7.0, so it may not work if you are running a very old NumPy. Keep in mind that the default is replace=True.
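A minimal sketch of the index-first approach for several draws at once, assuming the same values as the list a in the question (the names positions and values are only illustrative):

```python
import numpy as np

a = np.array([1, 4, 1, 3, 3, 2, 1, 4])

# Draw 3 distinct positions; replace=False guarantees no position repeats.
positions = np.random.choice(np.arange(a.size), size=3, replace=False)

# Look the values up afterwards, so each value stays tied to its position.
values = a[positions]
print(positions, values)
```

Because the positions are kept, duplicate values such as the three 1s remain distinguishable.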


6 Comments

No need to make a list and choose from it in this case; just do np.random.randint(0,a.size), unless, I suppose, many mutually exclusive choices are needed.
@askwchan, right! What was I thinking. np.random.randint(0,a.size, size=size_you_want) will be enough.
@CT Zhu: I get an AttributeError: 'list' object has no attribute 'size'
Oh, a is a list, not an array. Just convert it to an array first. I forgot to copy one line.
@askwchan, oh, no. Your method will always sample with replacement. HappyPy really needs replace=False, so that once an element is sampled it will not be sampled again.
15

Here's one way to find out the index of a randomly selected element:

import random # plain random module, not numpy's
random.choice(list(enumerate(a)))[0]
=> 4      # just an example, index is 4

Or you could retrieve the element and the index in a single step:

random.choice(list(enumerate(a)))
=> (1, 4) # just an example, index is 1 and element is 4

9 Comments

This is not working for me. It gives me a "ValueError: a must be 1-dimensional"
I copy/pasted your code and the list above, and I still get the same error. Is it working for you?
list(enumerate(a)) produces a list of tuples, which is considered a 2D array-like object. This won't work.
@HappyPy you're right, I tested it with random.choice, not np.random.choice. If you must absolutely use np.random.choice, then my answer won't work and I'll delete it. But if you use plain old random.choice (from the random module), it'll work.
Strong warning, this is going to have terrible performance, which is one of the primary reasons people use numpy in the first place. You're iterating over an entire array. It would be cheaper to just generate a random integer between 0 and the length of the list rather than this.
10
numpy.random.choice(a, size=however_many, replace=False)

If you want a sample without replacement, just ask numpy to make you one. Don't loop and draw items repeatedly. That'll produce bloated code and horrible performance.

Example:

>>> a = numpy.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> numpy.random.choice(a, size=5, replace=False)
array([7, 5, 8, 6, 2])

On a sufficiently recent NumPy (at least 1.17), you should use the new randomness API, which fixes a longstanding performance issue where the old API's replace=False code path unnecessarily generated a complete permutation of the input under the hood:

rng = numpy.random.default_rng()
result = rng.choice(a, size=however_many, replace=False)
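As a sketch of how the newer API can also hand back positions rather than values: passing an integer to rng.choice samples from np.arange of that integer. The seed and the names picked and remaining are illustrative assumptions, not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = np.array([1, 4, 1, 3, 3, 2, 1, 4])

# Sample 3 distinct positions from 0..a.size-1.
idx = rng.choice(a.size, size=3, replace=False)

picked = a[idx]              # the sampled values
remaining = np.delete(a, idx)  # everything else, still in original order
print(idx, picked, remaining)
```

np.delete returns a new array and leaves a itself untouched.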

4 Comments

I don't understand how would this work. What's "a" in this case? Could you provide an example please?
@HappyPy: a is exactly the same thing it is in your code; it's the array-like object we want a sample from. size is the number of elements we want in the sample, and replace=False asks for a sample without replacement. The result will be a 1D array of shape (however_many,) containing the sample you wanted.
The sample is already "a". I want to work directly with "a" so that I can control how many elements are still left and perform other operations with "a".
@HappyPy: That sounds like you're using numpy all wrong. If a is already a random sample, but you want to draw elements from a without replacement, you're essentially drawing another random sample from a. If you really, really want to successively remove elements from a, numpy is unlikely to help you.
4

This is a bit out of left field compared with the other answers, but I thought it might help with what it sounds like you're trying to do in a slightly larger sense. You can generate a random sample without replacement by shuffling the indices of the elements in the source array:

source = np.random.randint(0, 100, size=100) # generate a set to sample from
idx = np.arange(len(source))
np.random.shuffle(idx)
subsample = source[idx[:10]]

This will create a sample (here, of size 10) by drawing elements from the source set (here, of size 100) without replacement.

You can interact with the non-selected elements by using the remaining index values, i.e.:

notsampled = source[idx[10:]]
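The same shuffle-and-split idea can be sketched with the newer Generator API, assuming NumPy 1.17 or later; rng.permutation returns a shuffled copy of the index range, so no in-place shuffle is needed:

```python
import numpy as np

rng = np.random.default_rng()
source = rng.integers(0, 100, size=100)  # a set to sample from

# A random ordering of all positions 0..len(source)-1.
idx = rng.permutation(len(source))

subsample = source[idx[:10]]   # 10 elements drawn without replacement
notsampled = source[idx[10:]]  # the 90 elements that were not drawn
```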

Comments

2

Maybe late, but this solution is worth mentioning because I think it is the simplest way to do so:

a = [1, 4, 1, 3, 3, 2, 1, 4]
n = len(a)
idx = np.random.choice(list(range(n)), p=np.ones(n)/n)

It means you are choosing from the indices uniformly. In a more general case, you can do a weighted sampling (and return the index) in this way:

probs = [.3, .4, .2, 0, .1]
n = len(probs)  # p must have exactly one entry per index being chosen from
idx = np.random.choice(list(range(n)), p=probs)

If you repeat this many times (e.g. 1e5), the normalized histogram of the chosen indices will look like [0.30126 0.39817 0.19986 0. 0.10071] in this case, which is correct.

In any case, you should choose from the indices and, if you need weighting, use the values as their probabilities.
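A rough way to check the weighted index sampling empirically, sketched with the Generator API and np.bincount (the seed and the name freq are illustrative assumptions; exact frequencies will vary from run to run without a seed):

```python
import numpy as np

probs = np.array([0.3, 0.4, 0.2, 0.0, 0.1])
rng = np.random.default_rng(0)

# Draw 100,000 indices; an int first argument means "choose from range(n)".
draws = rng.choice(len(probs), size=100_000, p=probs)

# Fraction of draws that landed on each index.
freq = np.bincount(draws, minlength=len(probs)) / draws.size
print(freq)
```

The index with probability 0 is never drawn, and the other frequencies should sit close to their weights.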

Comments

1

Instead of using choice, you can also simply shuffle your array with random.shuffle, i.e.

random.shuffle(a)  # will shuffle a in-place

Comments

1

Here is a simple solution, just choose from the range function.

import numpy as np
a = [100,400,100,300,300,200,100,400]
I=np.random.choice(np.arange(len(a)))
print('index is '+str(I)+' number is '+str(a[I]))

Comments

0

Based on your comment:

The sample is already a. I want to work directly with a so that I can control how many elements are still left and perform other operations with a. – HappyPy

it sounds to me like you're interested in working with a after n randomly selected elements are removed. Instead, why not work with N = len(a) - n randomly selected elements from a? Since you want them to still be in the original order, you can select from indices like in @CTZhu's answer, but then sort them and grab from the original list:

import numpy as np
n = 3 #number to 'remove'
a = np.array([1,4,1,3,3,2,1,4])
i = np.random.choice(np.arange(a.size), a.size-n, replace=False)
i.sort()
a[i]
#array([1, 4, 1, 3, 1])

So now you can save that as a again:

a = a[i]

and work with a with n elements removed.
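An alternative sketch that avoids the sort, assuming a boolean mask is acceptable: mark n random positions as removed and index with the mask, which preserves the original order by construction. The names keep and a_kept are hypothetical.

```python
import numpy as np

rng = np.random.default_rng()
n = 3  # number to 'remove'
a = np.array([1, 4, 1, 3, 3, 2, 1, 4])

# Start by keeping everything, then switch off n distinct random positions.
keep = np.ones(a.size, dtype=bool)
keep[rng.choice(a.size, size=n, replace=False)] = False

a_kept = a[keep]  # survivors, in their original relative order
```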

Comments

0

The question title and its description are a bit different. I just wanted the answer to the title question, which was how to get only an (integer) index from numpy.random.choice(). Rather than any of the above, I settled on index = numpy.random.choice(len(array_or_whatever)) (tested in numpy 1.21.6).

Ex:

import numpy
a = [1, 2, 3, 4]
i = numpy.random.choice(len(a))

The problem I had with the other solutions was the unnecessary conversion to a list, which recreates the entire collection in a new object (slow!).

Reference: https://numpy.org/doc/stable/reference/random/generated/numpy.random.choice.html?highlight=choice#numpy.random.choice

Key point from the docs about the first parameter a:

a: 1-D array-like or int. If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if it were np.arange(a).

Since the question is very old, it's possible I'm coming at this from the convenience of newer versions supporting exactly what the OP and I wanted.
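Tying this back to the removal problem in the question, a minimal sketch (assuming a plain Python list) is to draw a position and pop it, so the element at exactly that position is removed rather than the first one with a matching value. The names original and chosen are illustrative.

```python
import numpy as np

a = [1, 4, 1, 3, 3, 2, 1, 4]
original = list(a)  # copy kept only to show what changed

i = int(np.random.choice(len(a)))  # a position, not a value
chosen = a.pop(i)                  # removes the element at that position

print(i, chosen, a)
```

The remaining elements keep their original order.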

Comments
