I am new to programming and trying to learn Python, so pardon me if this sounds silly. I am trying to compare the columns of a DataFrame and align the values in the second and third columns to the first column, which is used as the reference. When a value in the first column does not appear in the second or third column, I need NaN in that position. Could anyone help me out with how to do this? Please see the input and expected output below.

Input dataframe:

index A B C
0 290 390 160
1 390 450 290
2 160 290 NaN
3 450 NaN 450

Expected output:

index A B C
0 290 290 290
1 390 390 NaN
2 160 NaN 160
3 450 450 450

1 Answer

You can do something like this. First, build the DataFrame and collect the reference values from column A:

import numpy as np
import pandas as pd

df = pd.DataFrame([[290, 390, 160], [390, 450, 290], [160, 290, np.nan], [450, np.nan, 450]], columns=['A', 'B', 'C'])

# Column A holds the reference values
lis = list(df['A'])
print(lis)

Output

[290, 390, 160, 450]

Then, for each value in A, keep it if it appears anywhere in B (or C), and use NaN otherwise:

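# Collect each column's values once instead of rebuilding the list for every element
b_values = set(df['B'])
c_values = set(df['C'])
b = [i if i in b_values else np.nan for i in lis]
c = [i if i in c_values else np.nan for i in lis]
print(b)
print(c)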

Output

[290, 390, nan, 450] #b
[290, nan, 160, 450] #c

Replace columns B and C with the lists b and c:

df = df.assign(B=b, C=c)

Result

index A B C
0 290 290 290
1 390 390 NaN
2 160 NaN 160
3 450 450 450
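
As a possible alternative to the list comprehensions, the same membership check can be sketched with pandas' own `Series.isin` and `Series.where`. This is only a sketch under the assumption that the frame looks like the one in the question; the name `result` is made up for illustration and is not part of the approach above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[290, 390, 160], [390, 450, 290], [160, 290, np.nan], [450, np.nan, 450]],
    columns=["A", "B", "C"],
)

# Work on a copy so the original columns stay available for the lookups
result = df.copy()
for col in ["B", "C"]:
    # True where the value from A occurs anywhere in the original column
    found = df["A"].isin(df[col].dropna())
    # Keep A's value where it was found; where() fills the rest with NaN
    result[col] = df["A"].where(found)

print(result)
```

Because NaN is a float, columns B and C come back as float columns even though A stays integer.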