I am trying to merge additional dataframes (DF_B, DF_C) into DF_A so that the result equals DF_D.
The only way to tie the additional dataframes to DF_A is through column B_2, so I am trying to merge them on B_2. I tried this code below to merge the first additional dataframe (DF_B).
DF_D = pd.merge(DF_A, DF_B, how='left', on='B_2')
This almost worked, but it creates additional suffixed columns (C_3_x, D_4_x, C_3_y, D_4_y) instead of filling in the existing C_3 and D_4.
So I thought adding left_on= might work, but it did not.
DF_D = pd.merge(DF_A, DF_B, how='left', left_on=['B_2','C_3', 'D_4'])
I'm looking for a way to write the additional dataframes over the main dataframe until DF_D is filled out. I would also like DF_D to retain every row and the original columns / names, even where a row has no match during the merge.
Original main dataframe A:
     A_1   B_2  C_3   D_4
0  03/17  3001
1  03/17  2002    L  BLUE
2  03/17  3777
3  04/17  5555
4  04/17  3232
5  04/17  5000
6  04/17  5151
7  05/17  2212    S   RED
Additional dataframe B:
    B_2  C_3    D_4
0  3001    M   GRAY
1  3131    S   BLUE
2  3333   XS  GREEN
3  3232    L   PINK
4  3000    M    RED
Used like:
DF_1 = pd.merge(DF_A, DF_B, how='left', on='B_2')
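For reference, a minimal reproduction sketch of this first merge. It assumes the blank cells in DF_A are missing values (None/NaN) and that B_2 holds integers; neither is stated above, so adjust if the real data differs.

```python
import pandas as pd

# Assumption: blank cells are missing values, B_2 is an integer column.
DF_A = pd.DataFrame({
    "A_1": ["03/17", "03/17", "03/17", "04/17", "04/17", "04/17", "04/17", "05/17"],
    "B_2": [3001, 2002, 3777, 5555, 3232, 5000, 5151, 2212],
    "C_3": [None, "L", None, None, None, None, None, "S"],
    "D_4": [None, "BLUE", None, None, None, None, None, "RED"],
})
DF_B = pd.DataFrame({
    "B_2": [3001, 3131, 3333, 3232, 3000],
    "C_3": ["M", "S", "XS", "L", "M"],
    "D_4": ["GRAY", "BLUE", "GREEN", "PINK", "RED"],
})

# C_3 and D_4 exist on both sides and are not merge keys,
# so pandas keeps both copies and appends the default suffixes.
DF_1 = pd.merge(DF_A, DF_B, how="left", on="B_2")
print(list(DF_1.columns))
# ['A_1', 'B_2', 'C_3_x', 'D_4_x', 'C_3_y', 'D_4_y']
```

All eight rows of DF_A are kept, but the values from DF_B land in C_3_y / D_4_y rather than in the original columns.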
Additional dataframe C:
    B_2  C_3    D_4
0  5151    S   BLUE
1  5545    M   PINK
2  5555   XL    RED
3  5222    L   GRAY
4  5112    S  GREEN
Used like:
DF_D = pd.merge(DF_1, DF_C, how='left', on='B_2')
Desired result, final DF_D:
     A_1   B_2  C_3   D_4
0  03/17  3001    M  GRAY
1  03/17  2002    L  BLUE
2  03/17  3777
3  04/17  5555   XL   RED
4  04/17  3232    L  PINK
5  04/17  5000
6  04/17  5151    S  BLUE
7  05/17  2212    S   RED
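One possible way to get this fill-in behaviour, sketched under assumptions rather than offered as the definitive approach: look each B_2 value up in the additional dataframe and use it only where the main dataframe has a missing cell. `fill_from` is a hypothetical helper name, not a pandas function. The sketch assumes blank cells are missing values (None/NaN), that B_2 holds integers, and that values already present in DF_A should not be overwritten.

```python
import pandas as pd

# Assumption: blank cells are missing values, B_2 is an integer column.
DF_A = pd.DataFrame({
    "A_1": ["03/17", "03/17", "03/17", "04/17", "04/17", "04/17", "04/17", "05/17"],
    "B_2": [3001, 2002, 3777, 5555, 3232, 5000, 5151, 2212],
    "C_3": [None, "L", None, None, None, None, None, "S"],
    "D_4": [None, "BLUE", None, None, None, None, None, "RED"],
})
DF_B = pd.DataFrame({
    "B_2": [3001, 3131, 3333, 3232, 3000],
    "C_3": ["M", "S", "XS", "L", "M"],
    "D_4": ["GRAY", "BLUE", "GREEN", "PINK", "RED"],
})
DF_C = pd.DataFrame({
    "B_2": [5151, 5545, 5555, 5222, 5112],
    "C_3": ["S", "M", "XL", "L", "S"],
    "D_4": ["BLUE", "PINK", "RED", "GRAY", "GREEN"],
})


def fill_from(main, extra, key="B_2"):
    """Hypothetical helper: fill missing cells of `main` from `extra`, matched on `key`."""
    out = main.copy()
    # Keep one row per key so the lookup index is unique (first occurrence wins).
    lookup = extra.drop_duplicates(subset=key).set_index(key)
    for col in lookup.columns.intersection(out.columns):
        # map() looks up each key value; fillna() only touches missing cells.
        out[col] = out[col].fillna(out[key].map(lookup[col]))
    return out


DF_D = fill_from(fill_from(DF_A, DF_B), DF_C)
print(DF_D)
```

Because nothing is merged, no rows or columns are added: DF_D keeps DF_A's eight rows and its four column names, and rows with no match in DF_B or DF_C (3777, 5000) stay empty.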