2

I have two df columns addr_num1 and addr_num2 like below:

addr_num1  addr_num2
   10          10
   20          20
   33          35
   40          40
   50          53

I want to create a new column such that if the two values are the same, I will use one of them. If not I will combine them like below:

addr_num3
   10
   20
  33-35
   40
  50-53

How can I do this? Please advise.

2 Answers 2

3

Simple way using conditional statements:

s1 = df['addr_num1'].astype(str)
s2 = df['addr_num2'].astype(str)

import numpy as np
df['addr_num3'] = np.where(s1==s2, s1, s1+'-'+s2)

Alternative using reshaping:

df['addr_num3'] = (df[['addr_num1', 'addr_num2']]
 .astype(str)
 .reset_index()
 .melt(id_vars='index')
 .drop_duplicates(['index', 'value'])
 .groupby('index')['value'].agg('-'.join)
)

output:

   addr_num1  addr_num2 addr_num3
0         10         10        10
1         20         20        20
2         33         35     33-35
3         40         40        40
4         50         53     50-53
Sign up to request clarification or add additional context in comments.

Comments

0

You could use two steps

Set all values to your first and second column as str separated by -, this will be for non-"matches"

Use .loc to filter for matches and set the value to column one (as string for consistency)

df['addr_num3'] = df['addr_num1'].apply(str)+'-'+df['addr_num2'].apply(str)
df.loc[df['addr_num1']==df['addr_num2'],'addr_num3']=df['addr_num1'].apply(str)

loc allows to set column values based on a condition

Pandas docs on loc

Pandas docs on apply

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.