Python NumPy isin() function returns wrong result

Question

I used np.random.choice(datasize, n_train_data) to shuffle the dataset and split it. For the test set, I select the remaining indices:

np.random.seed(99)

dataset_index = np.arange(datasize)
train_index_arr = np.random.choice(dataset_index, n_train_data)
mask = ~np.isin(dataset_index, train_index_arr)
val_index_arr = dataset_index[mask]

However, it returns the wrong result. Please refer to the code below:

idx = np.random.choice(range(1000), 300)
sum(~np.isin(np.arange(1000), idx))
>> 742 # expected result: 700

What am I doing wrong?

Daniel F · Accepted Answer · 2020-02-05 10:24:15Z

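np.isin is working correctly. np.random.choice samples with replacement by default, so idx contains repeated values: 742 leftover indices means your 300 draws covered only 258 distinct ones. You need to set replace=False so that the choices you make don't go back into the choice pool: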

idx = np.random.choice(range(1000), 300, replace = False)
sum(~np.isin(np.arange(1000), idx))
Out[]: 700
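
To make the difference concrete, here is a minimal sketch rather than a definitive recipe. It assumes the sizes from the question's example (1000 items, 300 for training) and uses np.random.default_rng with an arbitrary seed; the names with_repl, n_unique, n_left_out, perm, train_alt and val_alt are illustrative and not from the original post.

```python
import numpy as np

# Seeded generator so the sketch is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(99)

# Illustrative sizes, taken from the question's example.
datasize, n_train_data = 1000, 300
dataset_index = np.arange(datasize)

# Default sampling is WITH replacement, so some indices can be drawn twice.
with_repl = rng.choice(dataset_index, n_train_data)
n_unique = np.unique(with_repl).size
n_left_out = int((~np.isin(dataset_index, with_repl)).sum())
# isin only looks at distinct values, so these two counts sum to datasize.
print(n_unique, n_left_out)

# Sampling WITHOUT replacement yields n_train_data distinct indices.
train_index_arr = rng.choice(dataset_index, n_train_data, replace=False)
mask = ~np.isin(dataset_index, train_index_arr)
val_index_arr = dataset_index[mask]
print(train_index_arr.size, val_index_arr.size)

# Alternative: shuffle all indices once and slice, which avoids isin entirely.
perm = rng.permutation(datasize)
train_alt, val_alt = perm[:n_train_data], perm[n_train_data:]
```

The permutation variant may be the simpler choice when all you need is two disjoint index sets; note that val_alt comes out shuffled, whereas val_index_arr from the mask stays in ascending order.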
