I have several Excel sheets with different columns, as follows:

Table A: Col1 Col2 Col3

Table B: Col2 Col4 Col5

Table C: Col1 Col6 Col7

My Final Table should be like:

Table Final: Col1 Col2 Col3 Col4 Col5 Col6 Col7

If there is no data for a particular column, it should remain blank. I have successfully merged two tables at a time, but I want to merge all the tables together.

This is the code that merges two sheets:

    import pandas as pd
    import numpy as np
    import glob
    df = pd.read_excel('C:/Users/Am/Downloads/sales-mar-2014.xlsx')
    status = pd.read_excel('C:/Users/Am/Downloads/customer-status.xlsx')
    all_data_st = pd.merge(df, status, how='outer') 
    all_data_st.to_excel('C:/Users/Am/Downloads/a1.xlsx',header=True)

This is the code I have written for merging more than two sheets:

    import pandas as pd
    import numpy as np
    import glob
    all_data = pd.DataFrame()
    for f in glob.glob('C:/Users/Am/Downloads/*.xlsx'):
        all_data = all_data.merge(pd.read_excel(f), how='outer')
    writer = pd.ExcelWriter('merged.xlsx', engine='xlsxwriter')
    all_data.to_excel(writer,sheet_name='Sheet1')
    writer.save()

This is the error I am getting:

Traceback (most recent call last):
  File "E:/allfile.py", line 7, in <module>
    all_data = all_data.merge(pd.read_excel(f), how='outer')
  File "C:\Users\Am\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 6868, in merge
    copy=copy, indicator=indicator, validate=validate)
  File "C:\Users\Am\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\reshape\merge.py", line 47, in merge
    validate=validate)
  File "C:\Users\Am\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\reshape\merge.py", line 524, in __init__
    self._validate_specification()
  File "C:\Users\Am\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\reshape\merge.py", line 1033, in _validate_specification
    lidx=self.left_index, ridx=self.right_index))
pandas.errors.MergeError: No common columns to perform merge on. Merge options: left_on=None, right_on=None, left_index=False, right_index=False
4
  • 1
    Hi Amreeta, could you extend your question a bit to include what you're getting right now and what you expect to get? Commented Jun 13, 2019 at 6:48
  • 1
    you should use concat() Commented Jun 13, 2019 at 6:54
  • Hi @BorjaTur, what I want is to merge files. The first code does this for only two files; the second code I have written is supposed to merge more than two files, but I guess I went wrong somewhere... Commented Jun 13, 2019 at 6:58
  • @anky_91 I have tried using concat(), but it says the column names of all the files should be the same. If you know how to solve it, can you post the syntax please? Commented Jun 13, 2019 at 6:59

2 Answers

The code for two sheets isn't working either, right? The argument is missing. I would recommend saving each type of Excel sheet in its own folder and then creating one file for each type of Excel sheet, based on the following help: Loading multiple csv files of a folder into one dataframe

Then you can run the merge:

    all_data_st = pd.merge(A, B, how='outer', on='Col2')
    all_data_st = pd.merge(all_data_st, C, how='outer', on='Col1')

Alternatively, try concat:

    import glob
    import pandas as pd

    # Read every workbook, then stack the rows; columns a file lacks are left as NaN
    frames = [pd.read_excel(f) for f in glob.glob('C:/Users/Am/Downloads/*.xlsx')]
    all_data = pd.concat(frames, axis=0, ignore_index=True)
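To see what concat does with mismatched columns, here is a minimal sketch using small in-memory frames in place of the Excel files. The names a, b and c and their values are made up for illustration; only the column layout comes from the question. Note that concat stacks rows and does not match them on a key, so it is not equivalent to an outer merge.

```python
import pandas as pd

# Hypothetical stand-ins for Table A, Table B and Table C from the question
a = pd.DataFrame({'Col1': [1], 'Col2': ['x'], 'Col3': [10]})
b = pd.DataFrame({'Col2': ['y'], 'Col4': [20], 'Col5': [30]})
c = pd.DataFrame({'Col1': [2], 'Col6': [40], 'Col7': [50]})

# Rows are stacked; any column a frame does not have is filled with NaN
stacked = pd.concat([a, b, c], axis=0, ignore_index=True, sort=False)
print(stacked)
```

The result has one row per input row and the union of all seven columns, so the column names do not need to be the same across files.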

7 Comments

Thanks for your help! The first code executes, but the second code to merge more than two files doesn't.
Creating different folders for different types of Excel sheets is a nice idea, but what if we need to sort thousands of Excel files?
@AmreetaKoner That's true; look at concat then. The second code only applies after you have loaded all your Excel files into three tables.
The C was missing in the code; you can try running it again.
Adding C merges three Excel sheets. In case there are hundreds of Excel sheets to merge, can you tell me a way out?

You can do this with the sample code below, which merges three .xlsx files with the columns you stated. If you have more than three files and know the columns on which you want to merge them, put this code in a function. The function should take two datasets and a merge column name as inputs and return a merged dataset. You can then iterate over the list of Excel files and call this function to get a final merged dataset.

Here is the sample code:

    import pandas as pd
    data_A = pd.read_excel('a.xlsx')
    data_B = pd.read_excel('b.xlsx')
    data_C = pd.read_excel('c.xlsx')
    print("File A Data:")
    print(data_A)
    print("File B Data:")
    print(data_B)
    print("File C Data:")
    print(data_C)

    data_AB = pd.merge(left=data_A, right=data_B, on='Col2', how='outer')
    data_ABC = pd.merge(left=data_AB, right=data_C, on='Col1', how='outer')
    print("Merged Data:")
    print(data_ABC)

The output will be the merged data of all three tables with all columns. Hope this helps you solve your problem.
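For many files, the function described above could look roughly like the sketch below. It is only one possible approach and rests on an assumption not stated in the answer: instead of passing the merge column by hand, it joins on whatever column names the two frames share. The helper names merge_on_shared_columns and merge_all, and the sample values, are hypothetical.

```python
from functools import reduce

import pandas as pd


def merge_on_shared_columns(left, right):
    """Outer-merge two frames on the column names they have in common."""
    shared = [col for col in left.columns if col in right.columns]
    if not shared:
        # No key to join on, so fall back to stacking the rows
        return pd.concat([left, right], ignore_index=True, sort=False)
    return pd.merge(left, right, how='outer', on=shared)


def merge_all(frames):
    """Fold a list of frames into one, merging them left to right."""
    return reduce(merge_on_shared_columns, frames)


# Hypothetical stand-ins for the three sheets in the question
a = pd.DataFrame({'Col1': [1, 2], 'Col2': ['x', 'y'], 'Col3': [10, 20]})
b = pd.DataFrame({'Col2': ['x'], 'Col4': [100], 'Col5': [200]})
c = pd.DataFrame({'Col1': [1, 3], 'Col6': [7, 8], 'Col7': [9, 10]})

merged = merge_all([a, b, c])
print(merged)
```

With real files, the list passed to merge_all could be built with something like [pd.read_excel(f) for f in glob.glob('C:/Users/Am/Downloads/*.xlsx')]. The order of the files matters here, because each step joins on the columns accumulated so far, so sorting the file list first may give more predictable results.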

1 Comment

Thanks for your help! This code is good for three Excel sheets, but what if we have hundreds of sheets to merge into one?
