2

I have a multidimensional ndarray and I'm looking to randomly select 1000 arrays WITH replacement. This seems simple to me, but I'm struggling to incorporate the with-replacement part.

There are 3065 arrays in this ndarray.

np.shape(train_spam)
(3065L, 58L)

I tried to use np.random.shuffle(), but this does not sample with replacement.

np.random.shuffle(train_spam)
X_train = train_spam[:1000,1:57]

My final output would have a shape of (1000L, 58L).

I suppose I could run a loop with an ndarray with

X_train = train_spam[0:57]

and then append, but I can't figure out how to append correctly so that it looks the same. Any help would be greatly appreciated.

6
  • Is this problem specific to numpy? If all you need is a general way to select random arrays with replacement, I suggest selection = [arrays[random.randrange(n)][:] for i in range(k)] where n is the size of arrays and k is the number of elements you want to select with replacement. Commented Nov 23, 2014 at 21:53
  • If that suits your needs, please let me know and I'll post a formal answer. Commented Nov 23, 2014 at 21:54
  • That actually did not work for me. To answer your question, I'm trying to get better at numpy. Commented Nov 23, 2014 at 22:00
  • I took a look at the numpy docs a minute ago. There's an ndarray method called choose(). Does this work for you? train_spam.choose([random.randrange(3065) for i in range(1000)]) Commented Nov 23, 2014 at 22:06
  • Thanks for the help. I got this error: TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe' Commented Nov 23, 2014 at 22:09

3 Answers

3

You could use

selected = train_spam[np.random.randint(train_spam.shape[0], size=1000)]
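
A minimal sketch of why this samples with replacement, using a small made-up array in place of train_spam (the stand-in data and the seed are assumptions for illustration). np.random.randint draws each index independently, so the same row can be picked more than once:

```python
import numpy as np

np.random.seed(0)  # fixed seed only so the sketch is repeatable

# Hypothetical stand-in for train_spam: 5 rows, 3 columns.
data = np.arange(15).reshape(5, 3)

# Draw 8 row indices independently from [0, 5); repeats are allowed.
idx = np.random.randint(data.shape[0], size=8)

# Integer-array indexing copies the chosen rows, in the order drawn.
selected = data[idx]

print(selected.shape)  # (8, 3)
```

Since 8 rows are drawn from only 5, at least one row has to appear more than once, which is exactly what np.random.shuffle() followed by slicing cannot produce.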

1 Comment

The 0 argument is unnecessary though. np.random.randint(train_spam.shape[0], size=1000) will do the same.
0

You can also build a list of indices with [random.randrange(n) for i in range(k)].

import random
import numpy as np

k = 1000                                           # Number of elements to select.
n = train_spam.shape[0]                            # Number of elements in array.
indices = [random.randrange(n) for i in range(k)]  # A plain Python list.
selected = train_spam[np.array(indices)]           # Convert indices to ndarray.

If you have a plain Python list from which you want to select elements with replacement, you can do this:

import random

pets = ['ant', 'bear', 'cat', 'dog', 'elephant', 'flamingo', 'gorilla', 'horse']
n = len(pets)
k = 10
selected = [pets[random.randrange(n)] for i in range(k)]
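
On Python 3.6 and later, the standard library's random.choices should give the same kind of result in a single call. A short sketch, repeating the pets list so the snippet runs on its own:

```python
import random

pets = ['ant', 'bear', 'cat', 'dog', 'elephant', 'flamingo', 'gorilla', 'horse']

# random.choices samples with replacement; k is the number of draws.
selected = random.choices(pets, k=10)

print(len(selected))  # 10
```

Note that random.sample, by contrast, samples without replacement, so it is not a substitute here.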


0

You can also draw multiple rows at random from an ndarray using NumPy's random Generator. Its choice() method samples with replacement by default.

a = np.array([[1,2,3],[2,3,4], [5,6,7]])
rng = np.random.default_rng()
rng.choice(a, 2)
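
A slightly fuller sketch of the same idea with the defaults spelled out. The seed and the sample size of 5 are arbitrary choices for illustration; replace=True and axis=0 are already the defaults of Generator.choice, so whole rows are drawn with replacement:

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
rng = np.random.default_rng(seed=42)  # seed is arbitrary, for repeatability

# Draw 5 whole rows (axis=0) with replacement from a 3-row array.
sample = rng.choice(a, size=5, replace=True, axis=0)

print(sample.shape)  # (5, 3)
```

For the array in the question, the equivalent call would be rng.choice(train_spam, size=1000).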
