4

I have been playing around with the random state variable from StratifiedKFold in sklearn, but it does not seem to be random. I believe that setting random_state=5, should give me a different testing set then setting random_state=4, but this does not seem to be the case. I have created some crude reproducible code below. First I load my data:

import numpy as np
from sklearn.cross_validation import StratifiedKFold
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target

Then I set random_state=5, for which I store the last values:

skf=StratifiedKFold(n_splits=5,random_state=5)
for (train, test) in skf.split(X,y): full_test_1=test
full_test_1

array([ 40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  90,  91,  92,
        93,  94,  95,  96,  97,  98,  99, 140, 141, 142, 143, 144, 145,
       146, 147, 148, 149])

Doing the same procedure for random_state=4:

skf=StratifiedKFold(n_splits=5,random_state=4)
for (train, test) in skf.split(X,y): full_test_2=test
full_test_2

array([ 40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  90,  91,  92,
        93,  94,  95,  96,  97,  98,  99, 140, 141, 142, 143, 144, 145,
       146, 147, 148, 149])

I can then check that they are equal:

np.array_equal(full_test_1,full_test_2)
True

I do not think that the two random states should be returning the same numbers. Is there a flaw in my logic or code?

1 Answer 1

4

From the linked docs

random_state : None, int or RandomState

When shuffle=True, pseudo-random number generator state used for shuffling. If None, use default numpy RNG for shuffling.

You aren't setting shuffle=True in your call to StratifiedKFold, so random_state won't do anything.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.