
I have two dataframes, and I want to subtract the values of one from the other conditionally, matching rows on two key columns.

For example:

df_1

a    b    value
1    1    1011
1    2    1012
2    1    1021
2    2    1022

df_2

a    b    value
9    9    99
1    2    12
2    1    21

I want to make df_1['value'] -= df_2['value'] wherever df_1['a'] == df_2['a'] and df_1['b'] == df_2['b'], so the output would be:

OUTPUT

a    b    value
1    1    1011
1    2    1000
2    1    1000
2    2    1022

Is there a way to achieve that without iterating over the whole dataframe? (It's pretty big.)

3 Answers

5

Make use of index alignment that pandas provides here, by setting a and b as your index before subtracting.


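# Index both frames by the key columns so arithmetic aligns on (a, b)
for df in [df1, df2]:
    df.set_index(['a', 'b'], inplace=True)

# Keys missing on one side count as 0; reindex keeps only df1's rows
df1.sub(df2, fill_value=0).reindex(df1.index)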

      value
a b
1 1  1011.0
  2  1000.0
2 1  1000.0
  2  1022.0
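
For a version that can be pasted and run as-is, here is a minimal sketch of the same index-alignment idea. It builds the frames from the question, skips inplace=True so the originals stay untouched, and restores the flat a/b/value layout at the end. The names left, right, and result are only for illustration, the cast back to int64 assumes the values are integers, and the reindex step assumes each (a, b) pair appears at most once per frame.

```python
import pandas as pd

# Example frames from the question
df_1 = pd.DataFrame({'a': [1, 1, 2, 2],
                     'b': [1, 2, 1, 2],
                     'value': [1011, 1012, 1021, 1022]})
df_2 = pd.DataFrame({'a': [9, 1, 2],
                     'b': [9, 2, 1],
                     'value': [99, 12, 21]})

# Index by the key columns without modifying the original frames
left = df_1.set_index(['a', 'b'])
right = df_2.set_index(['a', 'b'])

result = (
    left.sub(right, fill_value=0)      # align on (a, b); a missing side counts as 0
        .reindex(left.index)           # keep only df_1's keys, in df_1's order
        .astype({'value': 'int64'})    # assumption: values are integers
        .reset_index()                 # back to plain a, b, value columns
)

print(result)
```

With the sample data, the value column of result should come out as 1011, 1000, 1000, 1022.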

1 Comment

As I need the original structure, I ended up with: df1 = df1.sub(df2, fill_value=0).reindex(df1.index).reset_index(). Thank you all.
3

You could also perform a left join and subtract matching values. Here is how to do that:

(pd.merge(df_1, df_2, how='left', on=['a', 'b'], suffixes=('_1', '_2'))
 .fillna(0)
 .assign(value=lambda x: x.value_1 - x.value_2)
)[['a', 'b', 'value']]
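
As a self-contained variant of the left join, the sketch below fills only the missing df_2 values with 0 rather than calling fillna(0) on the whole frame, so any NaN already present in df_1 would be left alone. The names merged and result are illustrative, the int64 cast assumes integer values with no NaN in df_1, and the approach assumes each (a, b) pair appears at most once in df_2 (otherwise the join would duplicate rows of df_1).

```python
import pandas as pd

# Example frames from the question
df_1 = pd.DataFrame({'a': [1, 1, 2, 2],
                     'b': [1, 2, 1, 2],
                     'value': [1011, 1012, 1021, 1022]})
df_2 = pd.DataFrame({'a': [9, 1, 2],
                     'b': [9, 2, 1],
                     'value': [99, 12, 21]})

# Left join keeps every row of df_1, in df_1's order
merged = df_1.merge(df_2, how='left', on=['a', 'b'], suffixes=('_1', '_2'))

# Rows with no match in df_2 have NaN in value_2; treat those as 0
merged['value'] = (merged['value_1'] - merged['value_2'].fillna(0)).astype('int64')

result = merged[['a', 'b', 'value']]
print(result)
```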

1

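You could merge on a and b while keeping df_1's row labels, then write the difference back to the matching rows:

# Inner join on a and b; df_1's labels travel along in an 'index' column (assumes an unnamed index)
merged = df_1.reset_index().merge(df_2, on=['a', 'b']).set_index('index')
# Update only the matched rows; .loc avoids chained assignment
df_1.loc[merged.index, 'value'] = merged.value_x - merged.value_y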

Result:

In [37]: df_1
Out[37]:
   a  b  value
0  1  1   1011
1  1  2   1000
2  2  1   1000
3  2  2   1022
