Not sure if there is a more elegant way to do what I want to do. Basically, I need to determine the current row's "position" value based on the "factor" value and the previous row's "position" value.

I tried looping through the DataFrame and using if/else statements to update the value, but the code is clunky and the values didn't get updated.

Please kindly help, million thanks!

                       factor  position
time                                   
2022-05-13 06:00:00  0.489471         0
2022-05-13 07:00:00  0.711030         0
2022-05-13 08:00:00  0.566865         0
2022-05-13 09:00:00  0.489471         0
2022-05-13 10:00:00  0.288419         0

import pandas as pd

df = pd.DataFrame({'time': ['2022-05-13 06:00:00', '2022-05-13 07:00:00', '2022-05-13 08:00:00','2022-05-13 09:00:00', '2022-05-13 10:00:00'],
                   'factor': [0.489471, 0.711030, 0.566865, 0.489471, 0.288419],
                   'position': [0, 0, 0, 0, 0]})
df['time'] = pd.to_datetime(df['time'])
df.set_index('time', inplace=True)

threshold_2 = 0.7
threshold_1 = 0.35

for i in range(0, len(df)):
    # no position
    if i == 0 or df.iloc[i-1, :]['position'] == 0:
        if df.iloc[i, :]['factor'] > threshold_2:
            df.iloc[i, :]['position'] = 1
        else:
            df.iloc[i, :]['position'] = 0

    #has position
    elif df.iloc[i-1, :]['position'] != 0:
        if df.iloc[i, :]['factor'] > threshold_1:
            df.iloc[i, :]['position'] = 1
        else:
            df.iloc[i, :]['position'] = 0

  • Can you explain the logic and provide the expected output? Commented Nov 5, 2022 at 7:10

2 Answers

If I understand correctly, you can use this vectorized alternative:

threshold_2 = 0.7
threshold_1 = 0.35

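# rows where a new position is opened
m1 = df['factor'].gt(threshold_2)

# label each stretch that starts at an entry row (0 = before the first entry)
group = m1.cumsum()

# within each stretch, stay True only until factor first fails to exceed threshold_1
m2 = df.loc[group>0, 'factor'].gt(threshold_1).groupby(group).cummin()

# position is 1 on entry rows and on rows where the position is still held
df['position'] = (m1|df.index.isin(m2[m2].index)).astype(int)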

output:

                       factor  position
time                                   
2022-05-13 06:00:00  0.489471         0
2022-05-13 07:00:00  0.711030         1
2022-05-13 08:00:00  0.566865         1
2022-05-13 09:00:00  0.489471         1
2022-05-13 10:00:00  0.288419         0
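
As a cross-check on the idea above, here is a minimal sketch of a variant that wraps the same entry/exit logic in a function. The name `hysteresis_position` and its parameters are hypothetical, not part of pandas or of the answer. It assumes the entry threshold is at least as large as the exit threshold, and it casts the mask to integers before `cummin` so the result does not depend on how boolean columns are handled.

```python
import pandas as pd


def hysteresis_position(factor: pd.Series, enter: float, exit_: float) -> pd.Series:
    """Return 1 while a position is held, 0 otherwise (hypothetical helper).

    Assumes enter >= exit_.
    """
    entry = factor.gt(enter)                   # rows that open a position
    group = entry.cumsum()                     # 0 before the first entry, then 1, 2, ...
    above_exit = factor.gt(exit_).astype(int)  # 1 where the position may be kept
    # Within each group the running minimum drops to 0 at the first row that is
    # not above the exit threshold and stays 0 until the next entry row.
    held = above_exit.groupby(group).cummin().astype(bool)
    # Rows in group 0 come before any entry, so they can never be held.
    return (entry | (held & group.gt(0))).astype(int)


factor = pd.Series(
    [0.489471, 0.711030, 0.566865, 0.489471, 0.288419],
    index=pd.to_datetime([
        "2022-05-13 06:00:00", "2022-05-13 07:00:00", "2022-05-13 08:00:00",
        "2022-05-13 09:00:00", "2022-05-13 10:00:00",
    ]),
    name="factor",
)
position = hysteresis_position(factor, enter=0.7, exit_=0.35)
print(position.tolist())  # [0, 1, 1, 1, 0]
```

Because the function takes and returns a Series, it can be assigned straight back with `df['position'] = hysteresis_position(df['factor'], threshold_2, threshold_1)`.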

4 Comments

A little off-topic: would you mind having a quick look at my answer here (the updated part), please? The task is: "groupby CUI, find row where value equals a value in one column and pick value of another column. Set whole group to that value." Would you do it the same way I did there?
@Rabinzel I would rather mask the non-MSH values and bfill
OK, thanks! But that only works in that case because the MSH values are both the last of their group, right?
Yes, otherwise masking and groupby.transform('first'), or your approach

When you use chained indexing, the order and type of the indexing operation partially determine whether the result is a slice into the original object, or a copy of the slice.

Read more about it in the pandas user guide on indexing (the section on returning a view versus a copy).
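
To make the difference concrete, here is a small sketch on a made-up two-row frame (the values are illustrative only). It shows that the chained form used in the question writes to a temporary row object, while a single `.loc` or `.iloc` call writes to the DataFrame itself.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"factor": [0.2, 0.9], "position": [0, 0]})

# Chained indexing: df.iloc[1, :] first builds a new Series for the row
# (the columns have different dtypes, so it has to be a copy). The assignment
# then lands on that temporary object and df itself is left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about this pattern
    df.iloc[1, :]["position"] = 1
after_chained = df["position"].tolist()
print(after_chained)  # [0, 0]

# A single indexing call with both row and column updates df in place.
df.loc[df.index[1], "position"] = 1
after_loc = df["position"].tolist()
print(after_loc)  # [0, 1]

# The positional equivalent needs the integer location of the column.
df.iloc[0, df.columns.get_loc("position")] = 1
after_iloc = df["position"].tolist()
print(after_iloc)  # [1, 1]
```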

I made a few small changes to your code so that it works. For example, you don't need the inner if/else; you can just write 1 if foo else 0. Every assignment now goes through a single df.loc call instead of chained indexing.

df = pd.DataFrame({'time': ['2022-05-13 06:00:00', '2022-05-13 07:00:00', '2022-05-13 08:00:00','2022-05-13 09:00:00', '2022-05-13 10:00:00'],
                   'factor': [0.489471, 0.711030, 0.566865, 0.489471, 0.288419],
                   'position': [0, 0, 0, 0, 0]})
df['time'] = pd.to_datetime(df['time'])
df.set_index('time', inplace=True)
threshold_2 = 0.7
threshold_1 = 0.35

for i in range(0, len(df)):
    # no position
    if i == 0 or df.loc[df.index[i-1], 'position'] == 0:
        df.loc[df.index[i], 'position'] = 1 if df.loc[df.index[i], 'factor'] > threshold_2 else 0
    #has position
    elif df.loc[df.index[i-1], 'position'] != 0:
        df.loc[df.index[i], 'position'] = 1 if df.loc[df.index[i], 'factor'] > threshold_1 else 0

print(df)

                       factor  position
time                                   
2022-05-13 06:00:00  0.489471         0
2022-05-13 07:00:00  0.711030         1
2022-05-13 08:00:00  0.566865         1
2022-05-13 09:00:00  0.489471         1
2022-05-13 10:00:00  0.288419         0
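
If the per-row `.loc` lookups become slow on a larger frame, one possible variation is to track the previous position in a plain variable and assign the whole column once at the end. This is only a sketch of the same rule, using the data and thresholds from the question; the variable names `prev` and `positions` are made up here.

```python
import pandas as pd

df = pd.DataFrame(
    {"factor": [0.489471, 0.711030, 0.566865, 0.489471, 0.288419]},
    index=pd.to_datetime([
        "2022-05-13 06:00:00", "2022-05-13 07:00:00", "2022-05-13 08:00:00",
        "2022-05-13 09:00:00", "2022-05-13 10:00:00",
    ]),
)
df.index.name = "time"

threshold_2 = 0.7   # needed to open a position
threshold_1 = 0.35  # needed to keep an open position

positions = []
prev = 0  # no position before the first row
for value in df["factor"].to_numpy():
    # The threshold to beat depends on whether a position is currently held.
    threshold = threshold_1 if prev else threshold_2
    prev = int(value > threshold)
    positions.append(prev)

# One column assignment, so no chained indexing is involved.
df["position"] = positions
print(df["position"].tolist())  # [0, 1, 1, 1, 0]
```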
