FinalDf = pd.merge(HRTrainingData,HRDataSet1,HRDataSet2,HRDataSet3, on='EmployeeNumber', how='outer')

TypeError: merge() got multiple values for argument 'on'

I have only entered one 'on' argument, so I'm not sure what is going on here, but I am unable to merge these data frames. Any advice?

  • I believe the problem is that you are passing more than two dataframes to merge at once. You may need to either use concat or run a separate merge for each dataframe. Commented Jul 12, 2022 at 18:42
  • @ArchAngelPwn I actually have 15 data frames to merge into one... I tried to start with fewer because I can't figure out how to do 15. Commented Jul 12, 2022 at 18:44
  • @ArchAngelPwn I tried concat, but my number of rows doesn't match my csv file because I can't merge them on a specific column. Commented Jul 12, 2022 at 18:44

2 Answers

pd.merge takes exactly two dataframes (left and right), so you can nest the calls:

pd.merge(df3, pd.merge(df1,df2, on='EmployeeNumber', how='outer'), on='EmployeeNumber',how='outer')

Or with functools.reduce:

import functools

functools.reduce(lambda x,y : pd.merge(x,y, 
                                       on='EmployeeNumber', 
                                       how='outer'), 
                 [df1, df2, df3, df4, ..., df15])
# the code above is equivalent to:
# merge(merge(merge(merge(df1, df2), df3), df4), ..., df15)
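
As a minimal, self-contained sketch of the reduce approach: the three toy dataframes and the helper name merge_all below are hypothetical and only stand in for the real HR dataframes, which are not shown in the question.

```python
import functools

import pandas as pd

# Hypothetical toy frames; each one carries the shared 'EmployeeNumber' key.
df1 = pd.DataFrame({"EmployeeNumber": [1, 2], "Age": [30, 41]})
df2 = pd.DataFrame({"EmployeeNumber": [2, 3], "Dept": ["HR", "IT"]})
df3 = pd.DataFrame({"EmployeeNumber": [1, 3], "Salary": [50000, 62000]})


def merge_all(frames, key="EmployeeNumber"):
    """Outer-merge a list of dataframes pairwise on `key` (illustrative helper)."""
    return functools.reduce(
        lambda left, right: pd.merge(left, right, on=key, how="outer"),
        frames,
    )


# Works the same way for a list of 15 frames.
merged = merge_all([df1, df2, df3])
print(merged)
```

Each step merges only two frames, which is all pd.merge accepts, so the list can be any length.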

4 Comments

Thank you, I will give that a go. I actually have 15 data frames to merge in total.
@BrandiAustin, you can use the second approach (functools.reduce) for 15 dataframes
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
@BrandiAustin, convert the EmployeeNumber column in every dataframe to the same type (int or str), then merge them
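
A rough sketch of that dtype fix, assuming the mismatch is simply that one frame stores EmployeeNumber as integers and another as strings. The toy frames are hypothetical; casting to str is one option, and it will not reconcile keys that differ by whitespace or leading zeros.

```python
import pandas as pd

# Hypothetical frames with mismatched key dtypes.
df1 = pd.DataFrame({"EmployeeNumber": [1, 2], "Age": [30, 41]})          # integer key
df2 = pd.DataFrame({"EmployeeNumber": ["2", "3"], "Dept": ["HR", "IT"]})  # string key

# Cast the key column to a common type in every frame before merging.
frames = [df.astype({"EmployeeNumber": str}) for df in (df1, df2)]

merged = pd.merge(frames[0], frames[1], on="EmployeeNumber", how="outer")
print(merged)
```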

Or you can use join:

# set_index(..., inplace=True) returns None, so build new indexed frames instead
df_list = [d.set_index('EmployeeNumber') for d in [df1, df2, df3, df4]]

df_list[0].join(df_list[1:])

One advantage of pd.DataFrame.join over pd.DataFrame.merge is that join accepts a list of dataframes.

2 Comments

KeyError: "None of ['EmployeeNumber'] are in the columns"
If EmployeeNumber is already in the index (for example, because set_index with inplace=True already ran on that dataframe), remove the set_index part and just create a list of your dataframes.
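
One way to handle both cases is sketched below. The helper name indexed and the toy frames are hypothetical, and the sketch assumes the non-key column names do not overlap between frames, since joining a list does not support suffixes. Note that join defaults to how="left"; how="outer" is passed here to match the outer merges in the question.

```python
import pandas as pd


def indexed(df, key="EmployeeNumber"):
    """Return df indexed by `key`; leave it alone if `key` is not a column
    (for example, because it is already the index). Illustrative helper."""
    return df.set_index(key) if key in df.columns else df


# Hypothetical frames: df2 already has EmployeeNumber as its index.
df1 = pd.DataFrame({"EmployeeNumber": [1, 2], "Age": [30, 41]})
df2 = pd.DataFrame(
    {"Dept": ["HR", "IT"]},
    index=pd.Index([2, 3], name="EmployeeNumber"),
)
df3 = pd.DataFrame({"EmployeeNumber": [1, 3], "Salary": [50000, 62000]})

df_list = [indexed(d) for d in (df1, df2, df3)]

# Join the first frame with all the others on the shared index.
joined = df_list[0].join(df_list[1:], how="outer")
print(joined)
```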
