2

I want to generate a fixed number of random column indexes (without replacement) for each row of a numpy array.

A = np.array([[3, 5, 2, 3, 3],
       [1, 3, 3, 4, 5],
       [3, 5, 4, 2, 1],
       [1, 2, 3, 5, 3]])

If I fix the number of columns to pick per row at 2, I want something like

np.array([[1,3],
          [0,4],
          [1,4],
          [2,3]])

I am looking for a loop-free NumPy solution. I tried np.random.choice, but with replace=False I get the error

ValueError: Cannot take a larger sample than population when 'replace=False'
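
The question does not show the failing call, so the following is only a guess at how the error arises. It assumes the call looked roughly like the one below: with replace=False, np.random.choice treats the whole requested shape as a single sample, so asking for 4 x 2 = 8 distinct values from a population of 5 fails.

```python
import numpy as np

A = np.array([[3, 5, 2, 3, 3],
              [1, 3, 3, 4, 5],
              [3, 5, 4, 2, 1],
              [1, 2, 3, 5, 3]])
m, n = A.shape

# randint samples with replacement, so a row may contain the same index twice.
with_dupes = np.random.randint(n, size=(m, 2))

# Assumed reproduction of the error: 8 values requested without replacement
# from only 5 candidates, counted over the whole output, not per row.
try:
    np.random.choice(n, size=(m, 2), replace=False)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

In other words, replace=False applies to the output as a whole, not to each row separately, which is why a per-row approach is needed.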

3
  • I can't relate your desired result to the original array. What code produced the choice error? Obviously you can't choose 10 items without replacement from a population of 6. Are you trying to select a random 2 items from the 1st row, another random 2 from the 2nd, and so on? Commented Jul 11, 2018 at 7:36
  • @hpaulj If I do random.randint(A.shape[1], size=(A.shape[0],2)) to select 2 random column indexes for each row, I get rows with duplicate entries, and with replace=False I get the error. Commented Jul 11, 2018 at 8:02
  • OP wants random indices but it seems that the rows should be unique. Commented Jul 11, 2018 at 8:06

3 Answers

3

Here's one vectorized approach inspired by this post -

def random_unique_indexes_per_row(A, N=2):
    m,n = A.shape
    return np.random.rand(m,n).argsort(1)[:,:N]

Sample run -

In [146]: A
Out[146]: 
array([[3, 5, 2, 3, 3],
       [1, 3, 3, 4, 5],
       [3, 5, 4, 2, 1],
       [1, 2, 3, 5, 3]])

In [147]: random_unique_indexes_per_row(A, N=2)
Out[147]: 
array([[4, 0],
       [0, 1],
       [3, 2],
       [2, 0]])
In [148]: random_unique_indexes_per_row(A, N=3)
Out[148]: 
array([[2, 0, 1],
       [3, 4, 2],
       [3, 2, 1],
       [4, 3, 0]])
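
As a possible alternative, here is a minimal sketch that assumes NumPy 1.20 or later, where Generator.permuted can shuffle each row independently. The function name random_unique_indexes_per_row_rng and the seed parameter are made up for this illustration and are not part of the answer above. It also shows np.take_along_axis for pulling the selected values out of A.

```python
import numpy as np

def random_unique_indexes_per_row_rng(A, N=2, seed=None):
    # Hypothetical helper: same goal as the argsort approach, via Generator.permuted.
    m, n = A.shape
    rng = np.random.default_rng(seed)
    # One copy of [0, 1, ..., n-1] per row, then shuffle every row independently.
    cols = np.tile(np.arange(n), (m, 1))
    return rng.permuted(cols, axis=1)[:, :N]

A = np.array([[3, 5, 2, 3, 3],
              [1, 3, 3, 4, 5],
              [3, 5, 4, 2, 1],
              [1, 2, 3, 5, 3]])

idx = random_unique_indexes_per_row_rng(A, N=2, seed=0)
# Gather the values at the chosen column indexes, row by row.
picked = np.take_along_axis(A, idx, axis=1)
print(idx)
print(picked)
```

Both this sketch and the argsort version do work proportional to the full row length for every row, so they are best suited to cases where the number of columns is modest.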

1

Like this?

B = np.random.randint(A.shape[1], size=(len(A), 2))  # samples with replacement, so a row can contain duplicates

1 Comment

Welcome to Stack Overflow! Please don't answer just with source code. Try to provide a nice description about how your solution works. See: How do I write a good answer?. Thanks
0

You can use np.random.choice() as follows (note that, like randint, it samples with replacement by default):

def random_indices(arr, n):
    x, y = arr.shape
    return np.random.choice(np.arange(y), (x, n))
    # or return np.random.randint(low=0, high=y, size=(x, n))

Demo:

In [34]: x, y = A.shape

In [35]: np.random.choice(np.arange(y), (x, 2))
Out[35]: 
array([[0, 2],
       [0, 1],
       [0, 1],
       [3, 1]])

As an experimental approach, here is a way that gives unique rows about 99% of the time (the indices within a single row may still repeat):

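In [60]: def random_ind(arr, n):
    ...:     x, y = arr.shape
    ...:     # draw twice as many candidate rows as needed (with replacement)
    ...:     ind = np.random.randint(low=0, high=y, size=(x * 2, n))
    ...:     # hash each row with random weights to locate the unique candidate rows
    ...:     _, index = np.unique(ind.dot(np.random.rand(ind.shape[1])), return_index=True)
    ...:     # keep at most x rows (x == 4 for the array A above)
    ...:     return ind[index][:x]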

In [61]: random_ind(A, 2)
Out[61]: 
array([[0, 1],
       [1, 0],
       [1, 1],
       [1, 4]])

In [62]: random_ind(A, 2)
Out[62]: 
array([[1, 0],
       [2, 0],
       [2, 1],
       [3, 1]])

In [64]: random_ind(A, 3)
Out[64]: 
array([[0, 0, 0],
       [1, 1, 2],
       [0, 4, 1],
       [2, 3, 1]])

In [65]: random_ind(A, 4)
Out[65]: 
array([[0, 4, 0, 3],
       [1, 0, 1, 4],
       [0, 4, 1, 2],
       [3, 0, 1, 0]])

Note that the final slice in the return statement does not raise an error when fewer unique rows are available than requested; it silently returns fewer rows. In that case you can call the function again until you get the desired result.
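
The "call it again" idea can be sketched as follows. This is an illustration under stated assumptions, not the answer's own code: the names unique_candidate_rows and random_ind_retry and the max_tries parameter are hypothetical, and the sketch swaps the dot-product hash for np.unique(..., axis=0), keeping rows in draw order. Like the function above, it guarantees distinct rows only, not distinct indices within a row.

```python
import numpy as np

def unique_candidate_rows(arr, n):
    # Hypothetical variant: draw 2*x candidate rows, keep the distinct ones.
    x, y = arr.shape
    ind = np.random.randint(low=0, high=y, size=(x * 2, n))
    # return_index gives the first occurrence of each distinct row;
    # sorting those positions keeps the rows in their original draw order.
    _, index = np.unique(ind, axis=0, return_index=True)
    return ind[np.sort(index)][:x]

def random_ind_retry(arr, n, max_tries=100):
    # Retry until a draw yields one distinct row per row of arr.
    x = arr.shape[0]
    for _ in range(max_tries):
        out = unique_candidate_rows(arr, n)
        if len(out) == x:
            return out
    raise RuntimeError("could not draw enough unique rows")

A = np.array([[3, 5, 2, 3, 3],
              [1, 3, 3, 4, 5],
              [3, 5, 4, 2, 1],
              [1, 2, 3, 5, 3]])

rows = random_ind_retry(A, 2)
print(rows)
```

The retry cap avoids an endless loop when the request cannot be met, for example when there are fewer possible distinct rows than rows in the array.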

2 Comments

But it seems OP wants without replacement.
@Divakar It seems so, however I gave a solution for unique rows but not unique items in each row 0_0.
