python dataframe timeseries check if value changed more than x in last n rows and forward n rows

Question

I have sunlight data coming from the field. I want to check whether the sunlight changed by more than a given value in the last 1 min and in the next 1 min. Below is an example case, where I check whether the data value changed by more than 4 in the last 10 s. Code:

import numpy as np
import pandas as pd

xdf = pd.DataFrame({'data':np.random.randint(10,size=10)},index=pd.date_range('2022-06-03 00:00:00', '2022-06-03 00:00:45', freq='5s'))
# here data frequency 5s, so, to check last 10s
# I have to consider present row and last 2 rows
# Perform rolling max and min value for 3 rows
nrows = 3
# Allowable change
ac = 4
xdf['back_max'] = xdf['data'].rolling(nrows).max()
xdf['back_min'] = xdf['data'].rolling(nrows).min()
xdf['back_min_max_dif'] = (xdf['back_max'] - xdf['back_min'])
xdf['back_<4'] = (xdf['back_max'] - xdf['back_min']).abs().le(ac)
print(xdf)

## Again repeat the above for the future nrows
## Don't know how?

expected output:

                     data  back_max  back_min  back_min_max_dif  back_<4
2022-06-03 00:00:00     7       NaN       NaN               NaN    False
2022-06-03 00:00:05     7       NaN       NaN               NaN    False
2022-06-03 00:00:10     5       7.0       5.0               2.0     True
2022-06-03 00:00:15     8       8.0       5.0               3.0     True
2022-06-03 00:00:20     6       8.0       5.0               3.0     True
2022-06-03 00:00:25     2       8.0       2.0               6.0    False
2022-06-03 00:00:30     3       6.0       2.0               4.0     True
2022-06-03 00:00:35     1       3.0       1.0               2.0     True
2022-06-03 00:00:40     5       5.0       1.0               4.0     True
2022-06-03 00:00:45     5       5.0       1.0               4.0     True

Is there a way I can simplify the above procedure? Also, I have to perform a rolling max over the future nrows. How do I do that?

It's pretty much what I'd do. However, I'd skip attaching the intermediate columns to the dataframe. Also, do you need the two separate values for future and past changes?
– Quang Hoang, Commented Jun 3, 2022 at 19:26
Unrelated, there's an option of rolling on time windows as well. It might be slower than rolling by rows, however.
– Quang Hoang, Commented Jun 3, 2022 at 19:28
@QuangHoang I really appreciate your comment. I feel like there is an improvement in my coding. I don't want those columns; they take unnecessary memory. I only attached them here to explain the problem better. I have already done it for the past. The question is, how do I do it for the future?
– Mainland, Commented Jun 3, 2022 at 19:30
@QuangHoang Also, .rolling(-3) is not working. I thought it would do the forward rolling. How do we do forward rolling?
– Mainland, Commented Jun 3, 2022 at 19:36

Quang Hoang · Accepted Answer · 2022-06-03 19:37:03Z

For a future/forward roll, you can roll on the reversed data. This might not work with a time-window roll:

rolling = xdf['data'].rolling(nrows)
xdf['pass_<'] = (rolling.max()-rolling.min()).le(ac)

future_roll = xdf['data'][::-1].rolling(nrows)
xdf['future_<'] = future_roll.max().sub(future_roll.min()).le(ac)

Output:

                     data  pass_<  future_<
2022-06-03 00:00:00     7   False      True
2022-06-03 00:00:05     7   False      True
2022-06-03 00:00:10     5    True      True
2022-06-03 00:00:15     8    True     False
2022-06-03 00:00:20     6    True      True
2022-06-03 00:00:25     2   False      True
2022-06-03 00:00:30     3    True      True
2022-06-03 00:00:35     1    True      True
2022-06-03 00:00:40     5    True     False
2022-06-03 00:00:45     5    True     False
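
As an alternative to reversing the data, a forward window can also be requested directly. The following is a minimal sketch, not part of the original answer, assuming a reasonably recent pandas (1.1 or later) where `pandas.api.indexers.FixedForwardWindowIndexer` is available. The data values are copied from the example output above so the result can be compared row by row.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Same values as in the example output above (fixed instead of random)
idx = pd.date_range("2022-06-03 00:00:00", periods=10, freq="5s")
xdf = pd.DataFrame({"data": [7, 7, 5, 8, 6, 2, 3, 1, 5, 5]}, index=idx)

nrows = 3  # present row plus the next 2 rows
ac = 4     # allowable change

# The window for row i covers rows i .. i + nrows - 1
indexer = FixedForwardWindowIndexer(window_size=nrows)
fwd = xdf["data"].rolling(window=indexer, min_periods=nrows)

# Incomplete windows at the end yield NaN, which .le() evaluates as False
xdf["future_<"] = (fwd.max() - fwd.min()).le(ac)
print(xdf)
```

With these values the `future_<` column should match the one shown in the output above, including the `False` entries for the last two rows, where fewer than `nrows` future rows exist.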

9 Comments

Mainland Over a year ago

I thought about it. Really nice solution. Guess what: we can actually roll the pass_ result back, so we don't have to do the rolling max and min again for the future. Am I correct? In your answer, look at the 2nd and 3rd columns: both contain the same values. Rolling the 2nd column 3 rows backwards gives the 3rd column.

Quang Hoang Over a year ago

Yes, you're totally correct, forward roll is essentially shift(-nrows).rolling(nrows).do_something().shift(). But you're gonna miss some of the beginning/end windows.
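
The shift idea from this exchange can be sketched as follows. This is an illustrative sketch rather than code from the thread: it shifts the backward min/max range itself (before comparing with `ac`) by `-(nrows - 1)` rows, since a backward window ending at row `i + nrows - 1` covers the same rows as a forward window starting at row `i`. The variable names are hypothetical and the data values are taken from the answer's output.

```python
import pandas as pd

idx = pd.date_range("2022-06-03 00:00:00", periods=10, freq="5s")
s = pd.Series([7, 7, 5, 8, 6, 2, 3, 1, 5, 5], index=idx)

nrows = 3  # window length in rows
ac = 4     # allowable change

# Backward (past) range: max - min over the current row and the 2 before it
roll = s.rolling(nrows)
past_range = roll.max() - roll.min()

# Forward range via shift: move each backward result up by nrows - 1 rows
future_range_shift = past_range.shift(-(nrows - 1))

# Forward range via the reversed-data approach, restored to original order
rev = s[::-1].rolling(nrows)
future_range_rev = (rev.max() - rev.min())[::-1]

past_ok = past_range.le(ac)
future_ok = future_range_shift.le(ac)
print(pd.DataFrame({"data": s, "past_ok": past_ok, "future_ok": future_ok}))
```

Shifting the float range rather than the boolean flags avoids mixing `NaN` into a boolean column. This shortcut relies on min and max not depending on direction, as noted in the comments.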

Mainland Over a year ago

Sounds good. I am doing this: xdf['forward_<4'] = xdf['back_<4'][::-3], but I am getting a mix of True and NaNs. Am I wrong?

Quang Hoang Over a year ago

Do you mean xdf['back_<4'].shift(-3)? Otherwise you would get the same column back due to index alignment. Another thing: we can do this because min/max are direction-agnostic. You'd have to go the shift route if it's cumsum, for example.

Quang Hoang Over a year ago

Oh, [::-3] means you reverse and take only every 3rd row. So yeah, you're getting mixed NaN results (2 out of 3 rows?). So it's wrong.
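
The index-alignment behaviour described in the last two comments can be reproduced with a small sketch. The column names `flag`, `reversed` and `step_-3` are hypothetical, and the boolean values are simply the `pass_<` column from the answer's output.

```python
import pandas as pd

idx = pd.date_range("2022-06-03 00:00:00", periods=10, freq="5s")
xdf = pd.DataFrame(
    {"flag": [False, False, True, True, True, False, True, True, True, True]},
    index=idx,
)

# Reversing alone changes nothing after assignment:
# pandas realigns the values on the index labels
xdf["reversed"] = xdf["flag"][::-1]

# A step of -3 keeps only every third label (positions 9, 6, 3, 0);
# all other rows have no matching label and become NaN
xdf["step_-3"] = xdf["flag"][::-3]

print(xdf)
```

Here `reversed` ends up identical to `flag`, and `step_-3` is NaN in two out of every three rows, which is why slicing cannot stand in for `shift`.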
