I have two pandas DataFrames, df_x and df_y, and I want to update the 'SCORE' column that they have in common. I want to set the score to 100 when the following condition is met:

  • Age <= 45 AND column Banned != 1 OR column chargeback !=1

My attempt below did not work. Any suggestions?

if df_x.AGE_DAYS <= 45 and (df_x.BANNED != 1 or df_x.CHARGEBACK != 1):
    df_x['SCORE'] = 100
 
if df_y.AGE_DAYS <= 45 and (df_y.BANNED != 1 or df_y.CHARGEBACK != 1):
    df_y['SCORE'] = 100

Outcome: the 'SCORE' column should be overwritten with 100 for rows that meet the criteria described above, and left unchanged for all other rows.

To set up a sample testing environment:

import pandas as pd
df = pd.DataFrame([[1, 1, 0], [100, 0,0], [46, 1, 0]], columns=['AGE_DAYS', 'Banned', 'Chargeback'])

print(df)

Desired output: an "UPDATED SCORE" column is added here to show that scores for rows outside the criteria are not changed. Only rows that meet the criteria get a new value.

AGE BANNED  CHARGEBACK  SCORE   "UPDATED SCORE"
45    1        0         75      75
33    0        0         45      **100**
44    0        0         77      **100**
235   0        1         75      75
43    1        0         88      88
21    0        0         23      **100**
1     0        0         56      **100**
432   1        1         12      12
  • You are expected to add a sample of input data and your expected output as text. You can look at How to make good reproducible pandas examples. Commented Jun 18, 2021 at 23:57
  • How are you getting 100 in the updated score for a row where Age <= 45 is not satisfied? Commented Jun 19, 2021 at 3:05
  • Oops, good catch! Updated! @AnuragDabas Commented Jun 19, 2021 at 3:06

2 Answers

Try:

# Rows where AGE is at most 45 and neither BANNED nor CHARGEBACK equals 1
c = (df['AGE'] <= 45) & df[['BANNED', 'CHARGEBACK']].ne(1).all(axis=1)
# OR, equivalently (both lines build the same mask, so use either one):
c = df['BANNED'].ne(1) & df['CHARGEBACK'].ne(1) & (df['AGE'] <= 45)
# Note: the two flag checks must be joined with & (AND), not with OR as in the condition posted in the question
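
Two details in that condition are easy to trip over, so here is a small sketch on made-up rows (the frame `demo` is an assumption for illustration, not the asker's data). Python's `and`/`or` need a single True/False value, so using them on a whole Series raises a ValueError, while `&` and `|` work element-wise. Also, "neither flag is 1" needs `&`: with `|`, a row that is banned but has no chargeback still passes.

```python
import pandas as pd

# Illustrative rows only: banned, clean, and too old with a chargeback.
demo = pd.DataFrame({
    "AGE": [45, 33, 235],
    "BANNED": [1, 0, 0],
    "CHARGEBACK": [0, 0, 1],
})

# Plain `and` asks for the truth value of a whole Series, which pandas refuses.
try:
    if (demo["AGE"] <= 45) and (demo["BANNED"] != 1):
        pass
    raised = False
except ValueError:
    raised = True

young = demo["AGE"] <= 45
# OR between the flag checks: passes if at least one flag is not 1.
either_clean = young & ((demo["BANNED"] != 1) | (demo["CHARGEBACK"] != 1))
# AND between the flag checks: passes only if both flags are not 1.
both_clean = young & ((demo["BANNED"] != 1) & (demo["CHARGEBACK"] != 1))

print(raised)
print(either_clean.tolist())
print(both_clean.tolist())
```

The first row (banned, no chargeback) is the one where the two masks disagree, which is why the OR version does not reproduce the desired output.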

Finally, use the mask() method:

df['UPDATED SCORE']=df['SCORE'].mask(c,100)
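
As a check, here is a self-contained sketch that rebuilds the table from the question (column names AGE, BANNED, CHARGEBACK, SCORE are taken from that table) and applies the same mask() call. It is only meant to show that the condition reproduces the "UPDATED SCORE" column the asker listed.

```python
import pandas as pd

# Rows copied from the desired-output table in the question.
df = pd.DataFrame(
    [[45, 1, 0, 75], [33, 0, 0, 45], [44, 0, 0, 77], [235, 0, 1, 75],
     [43, 1, 0, 88], [21, 0, 0, 23], [1, 0, 0, 56], [432, 1, 1, 12]],
    columns=["AGE", "BANNED", "CHARGEBACK", "SCORE"],
)

# True where AGE <= 45 and neither flag column equals 1.
c = (df["AGE"] <= 45) & df[["BANNED", "CHARGEBACK"]].ne(1).all(axis=1)

# mask() replaces values where the condition is True and keeps the rest.
df["UPDATED SCORE"] = df["SCORE"].mask(c, 100)

print(df)
```

Because mask() returns a new Series, the original SCORE column stays untouched here; assign back to df['SCORE'] instead if the update should overwrite it.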

Alternatively, you can use NumPy's where() function for this:

import numpy as np
df['UPDATED SCORE'] = np.where(c, 100, df['SCORE'])
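
Since the question asks to overwrite SCORE in two frames, a boolean mask with .loc may be the most direct fit. The sketch below assumes the column names from the asker's own code (AGE_DAYS, BANNED, CHARGEBACK, SCORE); the helper `set_score_where_eligible` and the tiny sample frames are hypothetical, introduced only for illustration.

```python
import pandas as pd

def set_score_where_eligible(frame, score=100, max_age=45):
    """Hypothetical helper: overwrite SCORE in place on rows that meet the rule."""
    eligible = (
        (frame["AGE_DAYS"] <= max_age)
        & (frame["BANNED"] != 1)
        & (frame["CHARGEBACK"] != 1)
    )
    # .loc with a boolean mask assigns only to the selected rows.
    frame.loc[eligible, "SCORE"] = score
    return frame

# Made-up sample frames standing in for df_x and df_y.
df_x = pd.DataFrame({"AGE_DAYS": [33, 235], "BANNED": [0, 0],
                     "CHARGEBACK": [0, 1], "SCORE": [45, 75]})
df_y = pd.DataFrame({"AGE_DAYS": [43, 21], "BANNED": [1, 0],
                     "CHARGEBACK": [0, 0], "SCORE": [88, 23]})

for frame in (df_x, df_y):
    set_score_where_eligible(frame)

print(df_x)
print(df_y)
```

Rows that do not meet the rule are never touched, which matches the "update or do nothing" behaviour described in the question.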

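You can try using apply with axis=1, so that the lambda receives one row at a time:

# The conditional expression needs an else branch; here it keeps the existing score.
# Both flags are checked with "and" so the result matches the desired output in the question.
df_x['SCORE'] = df_x.apply(lambda x: 100 if x['AGE_DAYS'] <= 45 and x['BANNED'] != 1 and x['CHARGEBACK'] != 1 else x['SCORE'], axis=1)

df_y['SCORE'] = df_y.apply(lambda x: 100 if x['AGE_DAYS'] <= 45 and x['BANNED'] != 1 and x['CHARGEBACK'] != 1 else x['SCORE'], axis=1)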

2 Comments

I seem to get a syntax error here. I tried to debug it, but I don't know why it would report "invalid syntax" starting at the df_y portion.
Hmm, strange. Can you paste the traceback?
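
One pattern that produces this kind of report: a lambda of the form `lambda x: 100 if cond` with no `else` branch, or a call whose closing parenthesis is missing. Inside an open parenthesis Python keeps reading across line breaks, so the error tends to be flagged at the start of the next statement. The strings below are simplified stand-ins rather than anyone's exact code, and `compiles` is a hypothetical helper.

```python
# Simplified stand-ins for the three cases.
no_else = "lambda x: 100 if x > 0"
with_else = "lambda x: 100 if x > 0 else x"
unclosed_call = "max(1, 2"

def compiles(source):
    """Return True if the expression parses, False on a SyntaxError."""
    try:
        compile(source, "<sketch>", "eval")
        return True
    except SyntaxError:
        return False

print(compiles(no_else))
print(compiles(with_else))
print(compiles(unclosed_call))
```

Only the version with an `else` branch and balanced parentheses parses.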
