I have two numpy arrays, x and y, each about 2M elements long. x is sorted, but some of its values are repeated.
The task is to remove the entries from both x and y wherever a value in x is repeated, so that each value of x is kept only once. My idea is to create a mask. Here is what I have done so far:
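import numpy as np

def createMask(x):
    # Start from all True; np.empty would leave the mask uninitialized
    idx = np.ones(x.shape, dtype=bool)
    for i in xrange(len(x)-1):  # xrange is Python 2; use range on Python 3
        # Drop x[i] when the next element is identical
        if x[i+1] == x[i]:
            idx[i] = False
    return idx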
idx = createMask(x)
x = x[idx]
y = y[idx]
This method works, but it is slow (705 ms with %timeit), and I think it looks really clumsy. Is there a more elegant and efficient way? (I'm sure there is.)
Updated with best answer
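The second method builds the comparison with a list comprehension:
idx = [x[i+1] == x[i] for i in xrange(len(x)-1)]
And the third (and fastest) method compares adjacent elements with array slices:
idx = x[:-1] == x[1:]
Note that both expressions are True where an element equals its successor and have length len(x)-1, so before using them as the mask they have to be inverted and padded with one trailing True, for example np.append(~np.asarray(idx), True).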
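For reference, here is a minimal sketch of how the slice comparison could be turned into a full keep-mask that behaves like createMask (dropping an element when its successor is identical). The function name keep_mask and the sample data are made up for illustration, and the sketch assumes 1-D arrays and Python 3:

```python
import numpy as np

def keep_mask(x):
    """Boolean mask that is False wherever x[i] equals x[i+1] (1-D input assumed)."""
    x = np.asarray(x)
    mask = np.ones(x.shape, dtype=bool)  # keep everything by default
    mask[:-1] = x[:-1] != x[1:]          # drop an element if its successor is identical
    return mask

# Illustrative data, not from the original question
x = np.array([1, 2, 2, 3, 3, 3, 4])
y = np.array([10, 20, 21, 30, 31, 32, 40])

idx = keep_mask(x)
x_dedup = x[idx]
y_dedup = y[idx]
print(x_dedup)  # [1 2 3 4]
print(y_dedup)  # [10 21 32 40]
```

Like the loop version, this keeps the last element of each run of identical x values, so y keeps the value paired with that last occurrence.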
The results are (using ipython's %timeit):
First method: 751ms
Second method: 618ms
Third method: 3.63ms
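Outside of IPython, a comparison like this could be reproduced roughly with the standard-library timeit module. This is only a sketch: the array size, the random test data, and the function names are my own assumptions (the array is much smaller than 2M so the loop finishes quickly), it uses Python 3's range instead of xrange, and the numbers it prints will differ from the ones above.

```python
import timeit
import numpy as np

# Hypothetical test data: sorted integers with many repeats
rng = np.random.default_rng(0)
x = np.sort(rng.integers(0, 50_000, size=100_000))

def loop_compare(x):
    # Second method: element-by-element comparison in a list comprehension
    return [x[i + 1] == x[i] for i in range(len(x) - 1)]

def vectorized_compare(x):
    # Third method: compare the array with itself shifted by one
    return x[:-1] == x[1:]

# Both approaches should flag exactly the same duplicates
assert np.array_equal(np.asarray(loop_compare(x)), vectorized_compare(x))

# Best of three runs; the vectorized call is averaged over 100 executions
t_loop = min(timeit.repeat(lambda: loop_compare(x), number=1, repeat=3))
t_vec = min(timeit.repeat(lambda: vectorized_compare(x), number=100, repeat=3)) / 100
print("loop: %.3f ms, vectorized: %.3f ms" % (t_loop * 1e3, t_vec * 1e3))
```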
Credit to mtitan8 for both methods.