numpy array equivalent of pandas.shift() function?

Question

I have an array

    [False False False ...  True  True  True]

I want to check whether each value equals the previous one. In pandas, I can use something like:

np.where(df["col name"].shift(1).eq(df["col name"]), True, False)

I tried using scipy's shift, but the output isn't correct, so maybe I am using it wrong?

np.where(shift(long_gt_price, 1) == long_gt_price, "-", "Different")

Just to show you what I mean when I say it produces the incorrect output: the left column is the shift(1) and the right column is the unshifted column, so each value in the left column should equal the value diagonally above it in the right column. At least that's my understanding of what I want. The False / True at 5 down on the left and 4 down on the right therefore doesn't make any sense to me.

tstanisl · Accepted Answer · 2020-06-14 23:05:40Z

3

Why not use slicing?

arr[1:] == arr[:-1]

The result is an array one element shorter, but there is no need to handle border cases.
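As a minimal sketch of the slicing approach: the sample array is borrowed from the other answers in this thread, and padding the front with False is only one assumed way to treat the first element, which has no predecessor.

```python
import numpy as np

arr = np.array([False, False, False, True, True, True])

# Compare each element with the one before it; the result has len(arr) - 1 items.
same_as_prev = arr[1:] == arr[:-1]

# Optional: pad the front so the result lines up with the original array.
# Treating the first element as "different" is an assumption, not a rule.
aligned = np.concatenate(([False], same_as_prev))

# Same labelling as in the question's np.where call.
labels = np.where(aligned, "-", "Different")
print(labels)
```

If the shorter result is acceptable, the padding step can be dropped and `same_as_prev` used directly.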

answered Jun 14, 2020 at 23:05

tstanisl


Caio Castro · Accepted Answer · 2022-06-25 00:35:25Z

1

A simple function that shifts 1d-arrays, in a similar way to pandas:

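import numpy as np

def arr_shift(arr: np.ndarray, shift: int) -> np.ndarray:
    if shift == 0:
        return arr
    # Filler for the vacated positions (float64, so the result is float too).
    nas = np.full(abs(shift), np.nan)
    if shift > 0:
        # Shift right: drop the last `shift` elements, pad at the front.
        res = arr[:-shift]
        return np.concatenate((nas, res))
    # Shift left: drop the first `-shift` elements, pad at the back.
    res = arr[-shift:]
    return np.concatenate((res, nas))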

This is supposed to work with numerical arrays, as the vacated positions are filled with np.nan. It is trivial to select another "null" value by filling the nas array with whatever you want.

answered Jun 25, 2022 at 0:35

Caio Castro


2 Comments

Eugene Over a year ago

Really good function, and pretty fast. But it converts the dtype to np.float64, so if the input array was np.int32, output memory consumption can double. I thought of doing it with np.roll and then filling in NaNs, but that requires the input dtype to be a subtype of np.floating, so there is no difference in memory consumption. Your function is also ~30-40% faster for big arrays (and 5x faster for short arrays), though np.roll can shift along a given axis.

Caio Castro Over a year ago

I was not aware of the type conversion, thanks for pointing that out. I'll dig a little deeper to see if there's an elegant solution to that and update the answer if needed.
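One possible way around the type conversion discussed in these comments, sketched under assumptions: the function name `shift_keep_dtype` and its required `fill_value` argument are hypothetical, not part of the answer above. The idea is to allocate the output with the input's dtype and let the caller pick a fill value that dtype can represent.

```python
import numpy as np

def shift_keep_dtype(arr: np.ndarray, shift: int, fill_value) -> np.ndarray:
    """Shift a 1-D array without changing its dtype (hypothetical helper)."""
    result = np.empty_like(arr)  # same shape and dtype as the input
    if shift > 0:
        result[:shift] = fill_value   # vacated positions at the front
        result[shift:] = arr[:-shift]
    elif shift < 0:
        result[shift:] = fill_value   # vacated positions at the back
        result[:shift] = arr[-shift:]
    else:
        result[:] = arr
    return result

ints = np.array([1, 2, 3, 4], dtype=np.int32)
print(shift_keep_dtype(ints, 1, fill_value=0))
print(shift_keep_dtype(ints, -1, fill_value=0).dtype)
```

The trade-off is that integer and boolean arrays have no NaN, so the caller has to choose a sentinel such as 0 or False that cannot be confused with real data.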

Partha Mandal · Accepted Answer · 2020-06-15 00:16:11Z

0

This seems to be what you want:

import numpy as np

shift_by = 1
arr = np.array([False, False, False, True, True, True]).tolist()  # array -> list
shift_arr = [np.nan] * shift_by + arr[:-shift_by]  # list concatenation
np.equal(arr, shift_arr)

For pure NumPy:

shift_by = 1
arr = np.array([False, False, False, True, True, True])
np.concatenate([np.array([False] * shift_by), np.equal(arr[shift_by:], arr[:-shift_by])])

edited Jun 15, 2020 at 0:16

answered Jun 14, 2020 at 20:57

Partha Mandal


2 Comments

JPWilson Over a year ago

ValueError: operands could not be broadcast together with shapes (1375,) (1374,)

Partha Mandal Over a year ago

Yes, it's because the example uses a list. You can get a list out of your array with the tolist method, à la: arr = np.array([False, False, False, True, True, True]).tolist()
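To illustrate why the list matters (a small sketch, reusing the six-element sample array): with a Python list, + concatenates, while with an ndarray it is element-wise addition with broadcasting, so the "shifted" result ends up one element short.

```python
import numpy as np

arr = np.array([False, False, False, True, True, True])
shift_by = 1

# With a Python list, + concatenates, so the length stays at 6.
as_list = [np.nan] * shift_by + arr.tolist()[:-shift_by]

# With an ndarray, + is element-wise addition with broadcasting:
# the one-item list is added to each of the 5 remaining elements.
as_array = [np.nan] * shift_by + arr[:-shift_by]

print(len(as_list), as_array.shape)  # 6 (5,)

try:
    np.equal(arr, as_array)  # shapes (6,) and (5,) cannot be broadcast
except ValueError as exc:
    print("ValueError:", exc)
```

This matches the shape mismatch reported in the comment above, where the two operands differ in length by exactly one.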

Eugene · Accepted Answer · 2023-11-16 15:17:32Z

The code below can shift an np.ndarray along a given axis. It should be pretty fast. But beware of using the default fill_value with input arrays whose dtype is not a subtype of np.floating.

import numpy as np

# np.nan is used instead of np.NaN, which was removed in NumPy 2.0.
def np_shift(a: np.ndarray, shift_value: int, axis=0, fill_value=np.nan) -> np.ndarray:
    if shift_value == 0:
        return a

    # Roll along the axis, then overwrite the wrapped-around part.
    result = np.roll(a=a, shift=shift_value, axis=axis)
    axes = [slice(None)] * a.ndim
    if shift_value > 0:
        axes[axis] = slice(None, shift_value)
    else:
        axes[axis] = slice(shift_value, None)

    result[tuple(axes)] = fill_value

    return result

For example:

a = np.arange(100000, dtype=np.float64)
# OK
np_shift(a, shift_value=1, axis=0, fill_value=np.nan)

# ValueError: cannot convert float NaN to integer
a = np.arange(100000, dtype=np.int32)
np_shift(a, shift_value=1, axis=0, fill_value=np.nan)

# OK
np_shift(a, shift_value=1, axis=0, fill_value=-15)

If you don't want to worry about that issue, you can add a dtype check. For example:

def np_shift(a: np.ndarray, shift_value: int, axis=0, fill_value=np.nan) -> np.ndarray:
    if shift_value == 0:
        return a

    # Promote non-float input so that NaN can be stored in the result.
    if not np.issubdtype(a.dtype, np.floating):
        a = a.astype(np.float64)

    result = np.roll(a=a, shift=shift_value, axis=axis)
    axes = [slice(None)] * a.ndim
    if shift_value > 0:
        axes[axis] = slice(None, shift_value)
    else:
        axes[axis] = slice(shift_value, None)

    result[tuple(axes)] = fill_value

    return result
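The examples above only shift along axis 0 of a 1-D array. As a rough illustration of the axis argument, here is a condensed restatement of the same roll-then-overwrite idea applied to a 2-D array. The name `shift_along_axis` and the small test grid are made up for this sketch.

```python
import numpy as np

def shift_along_axis(a, shift_value, axis=0, fill_value=np.nan):
    """Condensed, hypothetical restatement of the roll-then-overwrite idea."""
    if shift_value == 0:
        return a.copy()
    result = np.roll(a, shift_value, axis=axis)
    # Select the slots that wrapped around along `axis` and overwrite them.
    index = [slice(None)] * a.ndim
    if shift_value > 0:
        index[axis] = slice(None, shift_value)
    else:
        index[axis] = slice(shift_value, None)
    result[tuple(index)] = fill_value
    return result

grid = np.arange(6, dtype=np.float64).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

right = shift_along_axis(grid, 1, axis=1)  # every row moves one column right
down = shift_along_axis(grid, 1, axis=0)   # every column moves one row down
print(right)
print(down)
```

With axis=1 the first column becomes NaN, and with axis=0 the first row does. The input here is already float64, so the NaN fill is representable.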

user2138149 · Accepted Answer · 2024-08-29 16:10:45Z

0

You are looking for the function numpy.roll.

Example usage:

>>> import numpy
>>> x = numpy.arange(10)
>>> numpy.roll(x, 2)
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
>>> numpy.roll(x, -2)
array([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])

There is a caveat to this. Elements which "roll off" the end of one array are re-introduced at the other end. If this isn't what you want, you will have to set these elements to zero, NaN or some other sensible default. You could also reduce the array length to remove them.

https://numpy.org/doc/stable/reference/generated/numpy.roll.html
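Both ways of handling the wrapped-around elements mentioned above can be sketched as follows. The float dtype is an assumption made so that NaN can be stored; an integer array would need a different placeholder.

```python
import numpy as np

x = np.arange(10, dtype=float)
k = 2

# Option 1: roll, then overwrite the k elements that wrapped around.
shifted = np.roll(x, k)
shifted[:k] = np.nan

# Option 2: roll, then drop the wrapped elements, giving a shorter array.
trimmed = np.roll(x, k)[k:]

print(shifted)
print(trimmed)
```

For a negative shift the wrapped elements land at the end of the array instead, so the slice to overwrite or drop would be the last |k| positions.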

answered Aug 29, 2024 at 16:10

user2138149


Soudipta Dutta · Accepted Answer · 2024-09-23 17:42:34Z

0

import numpy as np

arr = np.array([False, False, False, True, True, True])

comparison = np.insert(arr[1:] == arr[:-1], 0, False)
print(comparison)
'''
[False  True  True False  True  True]
'''

Using np.roll:

import numpy as np

arr = np.array([False, False, False, True, True, True])

comparison = arr == np.roll(arr, 1)
# Set the first element to False
comparison[0] = False  

"""
Original array: [False False False  True  True  True]
Comparison:        [False  True  True False  True  True]
"""

edited Sep 23, 2024 at 17:42

answered Sep 23, 2024 at 17:36

Soudipta Dutta
