I'm not sure whether there is a more elegant way to do this. Basically, I need to determine each row's "position" value from that row's "factor" value and the previous row's "position" value.
I tried looping through the DataFrame and updating the value with if/else statements, but the code is very clumsy and the values never get updated.
Any help would be much appreciated, thanks a million!
                       factor  position
time
2022-05-13 06:00:00  0.489471         0
2022-05-13 07:00:00  0.711030         0
2022-05-13 08:00:00  0.566865         0
2022-05-13 09:00:00  0.489471         0
2022-05-13 10:00:00  0.288419         0
import pandas as pd
df = pd.DataFrame({'time': ['2022-05-13 06:00:00', '2022-05-13 07:00:00', '2022-05-13 08:00:00', '2022-05-13 09:00:00', '2022-05-13 10:00:00'],
                   'factor': [0.489471, 0.711030, 0.566865, 0.489471, 0.288419],
                   'position': [0, 0, 0, 0, 0]})
df['time'] = pd.to_datetime(df['time'])
df.set_index('time', inplace=True)
threshold_2 = 0.7
threshold_1 = 0.35
for i in range(0, len(df)):
    # no position
    if i == 0 or df.iloc[i-1, :]['position'] == 0:
        if df.iloc[i, :]['factor'] > threshold_2:
            df.iloc[i, :]['position'] = 1
        else:
            df.iloc[i, :]['position'] = 0
    # has position
    elif df.iloc[i-1, :]['position'] != 0:
        if df.iloc[i, :]['factor'] > threshold_1:
            df.iloc[i, :]['position'] = 1
        else:
            df.iloc[i, :]['position'] = 0
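
A likely reason the values never change: `df.iloc[i, :]['position'] = 1` is chained indexing. `df.iloc[i, :]` first builds a separate Series for the row, and the assignment then lands on that temporary object rather than on `df`. Below is a minimal sketch of one way around it, assuming the intended rule is "enter when factor > threshold_2, stay in while factor > threshold_1". It keeps the previous position in a plain variable and writes the whole column once at the end; the names `positions`, `prev` and `threshold` are my own, not from the code above.

```python
import pandas as pd

df = pd.DataFrame({
    'time': ['2022-05-13 06:00:00', '2022-05-13 07:00:00', '2022-05-13 08:00:00',
             '2022-05-13 09:00:00', '2022-05-13 10:00:00'],
    'factor': [0.489471, 0.711030, 0.566865, 0.489471, 0.288419],
})
df['time'] = pd.to_datetime(df['time'])
df = df.set_index('time')

threshold_2 = 0.7   # factor must exceed this to open a position
threshold_1 = 0.35  # factor must stay above this to keep a position

positions = []
prev = 0  # assumption: there is no position before the first row
for factor in df['factor']:
    # Pick the threshold that applies given the previous row's position.
    threshold = threshold_2 if prev == 0 else threshold_1
    prev = 1 if factor > threshold else 0
    positions.append(prev)

# Assign the whole column in one step, which avoids chained indexing.
df['position'] = positions
print(df)
```

With the sample data this gives positions 0, 1, 1, 1, 0. If assigning inside the loop is preferred, a single indexer such as `df.iloc[i, df.columns.get_loc('position')] = 1` should also write through to `df`.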