
I am converting a SQL query that contains a join to Python (pandas).

This is the SQL code:

INSERT INTO tropical_fruits
SELECT DISTINCT A.* 
FROM fruits A LEFT OUTER JOIN tropical_fruits B 
ON A.[fruit1] = B.[fruit1] AND A.[fruit2] = B.[fruit2];

This is my Python conversion:

data = pd.merge(
    fruits, tropical_fruits,
    left_on=['fruit1','fruit2'], 
    right_on=['fruit1','fruit2'], 
    how='left'
)

But I haven't got the desired result. Is the Python code correct?

  • How did your Python try differ from your SQL try? Commented Mar 2, 2024 at 15:11
  • Python code is creating a new column with _x and _y suffix of these two columns 'fruit1','fruit2'. Commented Mar 2, 2024 at 15:18
  • out = fruits.merge(tropical_fruits, on=['fruit1','fruit2'], how='left') Commented Mar 2, 2024 at 15:22
  • It's still creating _x and _y suffix of the 2 columns Commented Mar 2, 2024 at 15:34
  • Read this post: pandas-merging-101. If this is not enough, please read the following notice and provide a minimal reproducible example. Commented Mar 2, 2024 at 15:42

2 Answers


Try this:

data = fruits.merge(tropical_fruits, on=['fruit1','fruit2'], how='left')
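Note that `on=` alone does not remove the `_x`/`_y` suffixes mentioned in the comments. pandas adds them whenever both frames share a column name that is not one of the join keys. A minimal sketch, assuming both frames carry an extra `id` column (the sample data is made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: both frames share a non-key column, `id`.
fruits = pd.DataFrame({'fruit1': ['apple', 'banana'], 'fruit2': ['d', 'f'], 'id': [1, 2]})
tropical_fruits = pd.DataFrame({'fruit1': ['apple', 'mango'], 'fruit2': ['d', 'h'], 'id': [1, 4]})

# Joining on the two keys leaves `id` in both frames, so pandas disambiguates it.
merged = fruits.merge(tropical_fruits, on=['fruit1', 'fruit2'], how='left')
merged_columns = list(merged.columns)
print(merged_columns)
```

The key columns themselves are not duplicated; only the overlapping non-key column is.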


If you want to avoid the _x and _y suffixes, select only the columns you need from each DataFrame before merging. Here the right-hand frame is reduced to the join keys, so no other column names overlap.

import pandas as pd

fruits = pd.DataFrame({'fruit1': ['apple', 'banana', 'cherry'], 'fruit2': ['d', 'f', 'h'], 'id': [1, 2, 3]})
tropical_fruits = pd.DataFrame({'fruit1': ['apple', 'banana', 'mango'], 'fruit2': ['d', 'n', 'h'], 'id': [1, 2, 4]})

# Keep only the join keys from the right frame so that `id` does not overlap.
result = pd.merge(fruits[['fruit1', 'fruit2', 'id']], tropical_fruits[['fruit1', 'fruit2']],
                  how='left', on=['fruit1', 'fruit2'])
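The merge covers only the join. The SQL in the question also applies `SELECT DISTINCT A.*` and inserts the rows into `tropical_fruits`. A rough sketch of the whole statement, assuming the same sample frames as above and that both tables have identical columns (the `keys`, `joined` and `selected` names are just illustrative):

```python
import pandas as pd

fruits = pd.DataFrame({'fruit1': ['apple', 'banana', 'cherry'], 'fruit2': ['d', 'f', 'h'], 'id': [1, 2, 3]})
tropical_fruits = pd.DataFrame({'fruit1': ['apple', 'banana', 'mango'], 'fruit2': ['d', 'n', 'h'], 'id': [1, 2, 4]})

keys = ['fruit1', 'fruit2']

# LEFT OUTER JOIN, keeping only the key columns of B so that A.* comes back unsuffixed.
joined = fruits.merge(tropical_fruits[keys], on=keys, how='left')

# SELECT DISTINCT A.*
selected = joined[fruits.columns].drop_duplicates()

# INSERT INTO tropical_fruits: append the selected rows to the existing frame.
tropical_fruits = pd.concat([tropical_fruits, selected], ignore_index=True)
print(tropical_fruits)
```

Because the SQL has no WHERE clause, this appends every distinct row of `fruits`, including rows that already exist in `tropical_fruits`. If the intent was to insert only the missing rows, one common approach is to pass `indicator=True` to `merge` and keep the rows where `_merge == 'left_only'`.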
