The basic issue is that random.shuffle swaps elements with the following assignment (it can be seen in the source of random.py in the standard library) -
x[i], x[j] = x[j], x[i]
If you do this kind of swap on the rows of a multi-dimensional NumPy array (as in your case), you get the issue -
In [41]: ll
Out[41]:
array([[7, 8],
       [5, 6],
       [1, 2],
       [3, 4]])
In [42]: ll[0] , ll[1] = ll[1] , ll[0]
In [43]: ll
Out[43]:
array([[5, 6],
       [5, 6],
       [1, 2],
       [3, 4]])
The following example shows why the issue occurs -
In [63]: ll = np.array([[1,2],[3,4],[5,6],[7,8]])
In [64]: ll[0]
Out[64]: array([1, 2])
In [65]: x = ll[0]
In [66]: x
Out[66]: array([1, 2])
In [67]: y = ll[1]
In [68]: y
Out[68]: array([3, 4])
In [69]: ll[1] = x
In [70]: y
Out[70]: array([1, 2])
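As you can see, when ll[1] was set to a new value, y reflected the change as well. This is because indexing a row of a NumPy array returns a view onto the array's memory rather than a separate object, and assigning to ll[1] copies the values of x into that memory in place (note that this is about the inner row ll[1], not ll itself) instead of rebinding ll[1] to the object referenced by x, as happens with lists. So in the swap, the first assignment overwrites the row that the second assignment is about to read.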
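To make the view behaviour concrete, here is a minimal sketch (the names shares, broken and fixed are only for illustration). It checks that a row shares memory with the parent array, reproduces the failing tuple swap, and shows one possible way to swap two rows by hand using fancy indexing, which copies the selected rows before assigning them back -

```python
import numpy as np

ll = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Indexing a row returns a view that shares memory with ll.
y = ll[1]
shares = np.shares_memory(y, ll)

# Tuple swap: the right-hand side holds two views, so the first
# assignment overwrites the data that the second assignment reads.
broken = ll.copy()
broken[0], broken[1] = broken[1], broken[0]

# Fancy indexing copies the selected rows first, so the swap is safe.
fixed = ll.copy()
fixed[[0, 1]] = fixed[[1, 0]]

print(shares)
print(broken)
print(fixed)
```

With the tuple swap both of the first two rows end up holding the same values, while the fancy-indexing version actually exchanges them.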
As a solution, you can use np.random.shuffle() instead of random.shuffle() -
In [71]: ll = np.array([[1,2],[3,4],[5,6],[7,8]])
In [72]: ll
Out[72]:
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
In [73]: from numpy.random import shuffle
In [74]: shuffle(ll)
In [75]: ll
Out[75]:
array([[7, 8],
       [3, 4],
       [1, 2],
       [5, 6]])
Please note that np.random.shuffle() only shuffles along the first axis of a multi-dimensional array, so the rows are reordered but the contents of each row are left intact. (If random.shuffle() had worked, it would have behaved the same way.)
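A short sketch of what "first axis only" means in practice. The variable names are illustrative, and the flatten-and-reshape step is just one possible approach if you want every element shuffled rather than whole rows -

```python
import numpy as np

ll = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Shuffle in place along the first axis: rows move, but each row
# keeps its own contents.
rows = ll.copy()
np.random.shuffle(rows)

# To shuffle every element, permute the flattened values and
# restore the original shape. np.random.permutation returns a copy,
# so ll itself is not modified.
flat = np.random.permutation(ll.ravel()).reshape(ll.shape)

print(rows)
print(flat)
```

After the row shuffle, pairs such as [1, 2] still appear together, whereas in the flattened version the individual values can land in any position.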