I have an array

    [False False False ...  True  True  True]

I want to check if the previous value == current value. In pandas, I can use something like...

np.where(df["col name"].shift(1).eq(df["col name"]), True, False)

I tried using scipy shift but the output isn't correct so maybe I am using it wrong?

np.where(shift(long_gt_price, 1) == (long_gt_price),"-", "Different")

Just to show you what I mean when I say it produces the incorrect output: the left column is the shift(1) result and the right column is the unshifted column, so each value in the left column should equal the value diagonally above it in the right column. At least that's my understanding of what I want. The change from False to True at row 5 on the left and row 4 on the right therefore doesn't make any sense to me.

[Image: screenshot of the shift(1) column next to the unshifted column]

6 Answers

3

Why not use slicing?

arr[1:] == arr[:-1]

The result is one element shorter than the input, but there are no border cases to handle.
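
As a minimal sketch of the slicing approach on the question's example data (the names same_as_prev and change_positions are illustrative, not from the question), the comparison and the positions where the value changes could be computed like this:

```python
import numpy as np

arr = np.array([False, False, False, True, True, True])

# same_as_prev[i] compares arr[i + 1] with arr[i], so it has len(arr) - 1 entries.
same_as_prev = arr[1:] == arr[:-1]
print(same_as_prev)        # [ True  True False  True  True]

# Indices in arr where the value differs from the one before it.
change_positions = np.flatnonzero(~same_as_prev) + 1
print(change_positions)    # [3]
```

The +1 maps positions in the shorter comparison array back to positions in arr.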

1

A simple function that shifts 1d-arrays, in a similar way to pandas:

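import numpy as np

def arr_shift(arr: np.ndarray, shift: int) -> np.ndarray:
    if shift == 0:
        return arr
    # The NaN padding is float64, so integer or boolean input comes back as float64.
    nas = np.full(abs(shift), np.nan)
    if shift > 0:
        # Shift right: pad the front, drop the last `shift` elements.
        res = arr[:-shift]
        return np.concatenate((nas, res))
    # Shift left: drop the first `-shift` elements, pad the end.
    res = arr[-shift:]
    return np.concatenate((res, nas))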

This is supposed to work with numerical arrays, as the vacated positions are filled with np.nan. It is trivial to select another "null" value by just filling the nas array with whatever you want.
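
One way to pick a different "null" value while also keeping the input dtype is sketched below. The function name arr_shift_fill and its fill_value parameter are hypothetical, and the sketch assumes fill_value is representable in arr.dtype (for example 0 or -1 for integers, False for booleans).

```python
import numpy as np

def arr_shift_fill(arr: np.ndarray, shift: int, fill_value) -> np.ndarray:
    """Hypothetical variant of arr_shift that keeps arr.dtype."""
    # Start with an array of the same shape and dtype, filled with fill_value.
    out = np.full_like(arr, fill_value)
    if shift == 0:
        out[:] = arr
    elif shift > 0:
        # Shift right: the first `shift` slots keep fill_value.
        out[shift:] = arr[:-shift]
    else:
        # Shift left: the last `-shift` slots keep fill_value.
        out[:shift] = arr[-shift:]
    return out

ints = np.array([1, 2, 3, 4], dtype=np.int32)
print(arr_shift_fill(ints, 1, 0))    # [0 1 2 3]
print(arr_shift_fill(ints, -1, 0))   # [2 3 4 0]
```

Because the output is allocated with np.full_like, an int32 input stays int32 instead of being upcast to float64.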

2 Comments

Really good function. And pretty fast. But your function converts dtype to np.float64, so if input array was np.int32, output memory consumption can be x2. I thought to make it through np.roll and filling NaNs but that requires input array dtype to be subtype of np.floating... So there is no difference in memory consumption, but your function is faster ~30-40% for big arrays (for short arrays it is 5x faster). Though np.roll can be used to shift for a given axis.
I was not aware of the type conversion, thanks for pointing that out. I'll dig a little deeper to see if there's an elegant solution to that and update the answer if needed.
0

This seems to be what you want:

shift_by = 1
arr = np.array([False, False, False, True, True, True]).tolist() ## array -> list
shift_arr = [np.nan]*shift_by + arr[:-shift_by]
np.equal(arr,shift_arr)

For purely numpy:

shift_by = 1
arr = np.array([False, False, False, True, True, True])
np.concatenate([np.array([False]*shift_by),np.equal(arr[shift_by:],arr[:-shift_by])])

2 Comments

ValueError: operands could not be broadcast together with shapes (1375,) (1374,)
Yes, it's because the example uses a list. You can get a list out of your array with a simple tolist method ala: arr = np.array([False, False, False, True, True, True]).tolist()
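
A small sketch of why the list conversion matters, using the example data from this answer (the names as_list and as_array are illustrative): with a Python list, + concatenates, while with a NumPy array the same expression is a broadcast element-wise addition and the result stays one element short.

```python
import numpy as np

arr = np.array([False, False, False, True, True, True])

# List + list concatenates: one NaN followed by the first five values.
as_list = [np.nan] * 1 + arr.tolist()[:-1]
print(len(as_list))      # 6

# List + ndarray is element-wise addition, so the length stays 5.
as_array = [np.nan] * 1 + arr[:-1]
print(as_array.shape)    # (5,)

try:
    np.equal(arr, as_array)
except ValueError as exc:
    # Shapes (6,) and (5,) cannot be broadcast together.
    print("ValueError:", exc)
```
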
0

The code below can shift an np.ndarray along a given axis. It should be pretty fast. But beware of using the default fill_value with input arrays whose dtype is not a subtype of np.floating.

def np_shift(a: np.ndarray, shift_value: int, axis=0, fill_value=np.nan) -> np.ndarray:
    if shift_value == 0:
        return a
    
    result = np.roll(a=a, shift=shift_value, axis=axis)
    axes = [slice(None)] * a.ndim
    if shift_value > 0:
        axes[axis] = slice(None, shift_value)
    else:
        axes[axis] = slice(shift_value, None)

    result[tuple(axes)] = fill_value

    return result

For example:

a = np.arange(100000, dtype=np.float64)
# OK: a float array can hold NaN
np_shift(a, shift_value=1, axis=0, fill_value=np.nan)


# ValueError: cannot convert float NaN to integer
a = np.arange(100000, dtype=np.int32)
np_shift(a, shift_value=1, axis=0, fill_value=np.nan)


# OK: the fill value is an integer
np_shift(a, shift_value=1, axis=0, fill_value=-15)

If you don't want to worry about that issue, you can add a dtype check. For example:

def np_shift(a: np.ndarray, shift_value: int, axis=0, fill_value=np.nan) -> np.ndarray:
    if shift_value == 0:
        return a
    
    if not np.issubdtype(a.dtype, np.floating):
        a = a.astype(np.float64)
    
    result = np.roll(a=a, shift=shift_value, axis=axis)
    axes = [slice(None)] * a.ndim
    if shift_value > 0:
        axes[axis] = slice(None, shift_value)
    else:
        axes[axis] = slice(shift_value, None)

    result[tuple(axes)] = fill_value

    return result
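
The check above always converts non-float input to float64, even when the fill value would fit the original dtype. A possible alternative, sketched here under the assumption that np.result_type picks a suitable common dtype for the array and the fill value, is to upcast only when needed. The name np_shift_promote is hypothetical.

```python
import numpy as np

def np_shift_promote(a: np.ndarray, shift_value: int, axis=0, fill_value=np.nan) -> np.ndarray:
    # Hypothetical variant: choose a dtype that can hold both a and fill_value.
    dtype = np.result_type(a, fill_value)
    result = np.roll(a.astype(dtype, copy=False), shift_value, axis=axis)
    if shift_value == 0:
        return result
    # Overwrite the elements that np.roll wrapped around.
    index = [slice(None)] * a.ndim
    if shift_value > 0:
        index[axis] = slice(None, shift_value)
    else:
        index[axis] = slice(shift_value, None)
    result[tuple(index)] = fill_value
    return result

m = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
kept = np_shift_promote(m, 1, axis=1, fill_value=0)   # integer fill: stays int32
upcast = np_shift_promote(m, -1, axis=0)              # NaN fill: becomes float64
```

With an integer fill value the 2D example keeps its int32 dtype; with the default NaN fill it is promoted to a floating dtype.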

0

You are looking for the function numpy.roll.

Example usage:

>>> import numpy
>>> x = numpy.arange(10)
>>> numpy.roll(x, 2)
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
>>> numpy.roll(x, -2)
array([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])

There is a caveat to this. Elements which "roll off" the end of one array are re-introduced at the other end. If this isn't what you want, you will have to set these elements to zero, NaN or some other sensible default. You could also reduce the array length to remove them.
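
The two ways of dealing with the wrapped-around elements could look roughly like this (a sketch; the names shifted and trimmed are illustrative, and 0 is just one possible default):

```python
import numpy as np

x = np.arange(10)

# Option 1: overwrite the elements that wrapped around with a default value.
shifted = np.roll(x, 2)
shifted[:2] = 0
print(shifted)   # [0 0 0 1 2 3 4 5 6 7]

# Option 2: drop the wrapped elements, giving a shorter array.
trimmed = np.roll(x, 2)[2:]
print(trimmed)   # [0 1 2 3 4 5 6 7]
```

Filling with NaN instead of 0 would require a floating dtype, since an integer array cannot hold NaN.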

https://numpy.org/doc/stable/reference/generated/numpy.roll.html

0
import numpy as np

arr = np.array([False, False, False, True, True, True])

comparison = np.insert(arr[1:] == arr[:-1], 0, False)
print(comparison)
'''
[False  True  True False  True  True]
'''

Using np.roll:

import numpy as np

arr = np.array([False, False, False, True, True, True])

comparison = arr == np.roll(arr, 1)
# Set the first element to False
comparison[0] = False  

"""
Original array: [False False False  True  True  True]
Comparison:        [False  True  True False  True  True]
"""
