Adding new data to a Dataframe from another Dataframe based on condition

Question

My question is: how can I add data in a new column to a dataframe based on conditions from another dataframe? It is a bit difficult to describe, so here is an example.

df1

columns  a   b  c
         0   10  1
         10  15  3
         15  20  5


df2
columns  d      e  
         3.3   10   
         5.5   20
         14.5  11
         17.2  5

What I want to do is add another column f to df2 whose values come from df1: if d[i] is between a[j] and b[j], copy the value c[j] to f[i] in df2. For example, d[1] = 5.5 and 0 < 5.5 < 10, so f[1] = c[0] = 1.

The final result should look like:

df2
columns  d      e    f
         3.3   10    1 
         5.5   20    1
         14.5  11    3
         17.2  5     5
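
For anyone who wants to try the answers without copying the tables by hand, the two frames can be built directly. This is only a sketch of the example data above; `expected` is a hypothetical name for the desired result, not something from the original post.

```python
import pandas as pd

# Example data from the question.
df1 = pd.DataFrame({"a": [0, 10, 15], "b": [10, 15, 20], "c": [1, 3, 5]})
df2 = pd.DataFrame({"d": [3.3, 5.5, 14.5, 17.2], "e": [10, 20, 11, 5]})

# `expected` is a hypothetical name for the desired result shown above.
expected = df2.assign(f=[1, 1, 3, 5])
print(expected)
```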

Any help is greatly appreciated!

Regards,

Steve

So i can be any number in range(len(df2)) and j can be any number in range(len(df1)). i and j do not need to be the same!
– Steve Xu, Commented Jan 27, 2023 at 22:23
So let's say i == 2; then d[2] == 14.5. 14.5 falls in the range 10 to 15, so j == 1 and c[j] == 3; therefore f[i] = 3.
– Steve Xu, Commented Jan 27, 2023 at 22:34

Chrysophylaxs · Accepted Answer · 2023-01-27 22:46:10Z

4

Assuming non-overlapping intervals in df1 a and b, you can use pd.cut with a pd.IntervalIndex:

import pandas as pd

# Your dfs here
df1 = pd.read_clipboard()
df2 = pd.read_clipboard()

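# One interval per row of df1; from_arrays builds right-closed intervals (a, b] by default
idx = pd.IntervalIndex.from_arrays(df1["a"], df1["b"])
# Series of c values indexed by those intervals
mapping = df1["c"].set_axis(idx)

# Bin each d into its interval, then look up the matching c
df2["f"] = pd.cut(df2["d"], idx).map(mapping)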

df2:

      d   e  f
0   3.3  10  1
1   5.5  20  1
2  14.5  11  3
3  17.2   5  5
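
One thing worth checking with this approach is what happens on interval boundaries and outside all intervals. The sketch below reuses the question's df1; the `probe` frame, the `lookup` helper and the `f_right`/`f_left` column names are made up for illustration. It assumes you want either right-closed or left-closed intervals, which is what the `closed` argument of `pd.IntervalIndex.from_arrays` controls.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [0, 10, 15], "b": [10, 15, 20], "c": [1, 3, 5]})
# 10.0 sits on a shared boundary and 25.0 lies outside every interval.
probe = pd.DataFrame({"d": [3.3, 10.0, 14.5, 25.0]})


def lookup(values, closed):
    """Map each value to df1['c'] via the interval that contains it."""
    idx = pd.IntervalIndex.from_arrays(df1["a"], df1["b"], closed=closed)
    mapping = df1["c"].set_axis(idx)
    return pd.cut(values, idx).map(mapping)


probe["f_right"] = lookup(probe["d"], "right")  # intervals are (a, b]
probe["f_left"] = lookup(probe["d"], "left")    # intervals are [a, b)
print(probe)
```

With right-closed intervals the boundary value 10.0 belongs to the first row of df1, with left-closed intervals it belongs to the second, and a value that falls in no interval comes back as NaN in both cases.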


sammywemmy · Answer · 2023-01-28 03:48:42Z

2

If you do not have overlapping intervals, the accepted pd.IntervalIndex solution is a perfect fit.

Another option is with conditional_join from pyjanitor, which can also handle overlapping intervals:

# pip install pyjanitor
import pandas as pd
import janitor

(df2
.conditional_join(
    # the columns being compared
    # must have the same dtype
    df1.astype({"a": float, "b": float}),
    ('d', 'a', '>='),
    ('d', 'b', '<='),
    # depending on the data size,
    # numba may offer more performance
    use_numba=False,
    right_columns={'c': 'f'})
)

      d   e  f
0   3.3  10  1
1   5.5  20  1
2  14.5  11  3
3  17.2   5  5
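
To decide which of the two approaches applies, you can ask pandas whether the intervals overlap. A small sketch using the question's df1 (the variable names `right_closed` and `both_closed` are just for illustration): note that the `>=` / `<=` conditions above treat both endpoints as included, and under that reading the rows of df1 do touch at 10 and 15.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [0, 10, 15], "b": [10, 15, 20], "c": [1, 3, 5]})

# (a, b]: neighbouring rows share an endpoint, but only one side includes it.
right_closed = pd.IntervalIndex.from_arrays(df1["a"], df1["b"], closed="right")
# [a, b]: the shared endpoints 10 and 15 belong to two intervals each.
both_closed = pd.IntervalIndex.from_arrays(df1["a"], df1["b"], closed="both")

print(right_closed.is_overlapping)  # False
print(both_closed.is_overlapping)   # True
```

So with inclusive bounds on both sides, a value of exactly 10 or 15 in d would match two rows of df1, and you would need to decide which one should win.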


user19077881 · Answer · 2023-01-27 22:36:25Z

1

You could use:

result = []
for item in df2['d']:
    val = None  # stays None when no interval contains item
    for _, row in df1.iterrows():
        if row['a'] <= item <= row['b']:
            val = row['c']
            break  # keep the first matching row
    result.append(val)

df2['f'] = result

print(df2)
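
If the frames get larger, the same first-match logic can be written without Python loops. This is a sketch using NumPy broadcasting, not part of the answer above; it assumes the question's data and that comparing every d against every row of df1 fits in memory. The names `inside` and `first` are made up here.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [0, 10, 15], "b": [10, 15, 20], "c": [1, 3, 5]})
df2 = pd.DataFrame({"d": [3.3, 5.5, 14.5, 17.2], "e": [10, 20, 11, 5]})

d = df2["d"].to_numpy()[:, None]  # column vector, shape (len(df2), 1)
# inside[i, j] is True when a[j] <= d[i] <= b[j]; shape (len(df2), len(df1))
inside = (df1["a"].to_numpy() <= d) & (d <= df1["b"].to_numpy())

first = inside.argmax(axis=1)  # position of the first matching row of df1
# Rows of df2 with no matching interval get NaN instead of a value from c.
df2["f"] = np.where(inside.any(axis=1), df1["c"].to_numpy()[first], np.nan)
print(df2)
```

Because NaN is used for "no match", the new column comes out as float rather than int.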


Amber · Answer · 2023-01-27 22:40:19Z

1

import pandas as pd

df1 = pd.DataFrame({'a': [0, 10, 15], 'b': [10, 15, 20], 'c': [1, 3, 5]})
df2 = pd.DataFrame({'d': [3.3, 5.5, 14.5, 17.2], 'e': [10, 20, 11, 5]})
df2['f'] = 0  # default when d falls in no interval
for i in range(df2.shape[0]):
    for j in range(df1.shape[0]):
        if df1.a[j] <= df2.d[i] <= df1.b[j]:
            # assign through .loc to avoid chained assignment
            df2.loc[i, 'f'] = df1.c[j]
df2


Lorenzo Bassetti · Answer · 2023-01-27 22:49:22Z

1

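What about this option?

# pair every row of df2 with every row of df1 (how='cross' needs pandas >= 1.2)
df = df2.reset_index().merge(df1, how='cross')
# keep the pairs where d lies between a and b
df = df[(df['a'] <= df['d']) & (df['d'] <= df['b'])]
# first match per original df2 row; rows without a match get NaN
df2['f'] = df.groupby('index')['c'].first()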
