0

I have one excel file that has multiple sheets that share some columns and rows (all numeric data). I found this good question but that not really answer my question, Import multiple excel sheets from different files into python and concatenate them into one dataframe

I am looking for a way to loop through all the sheets located in one single excel file and combine them into one table by summing the shared columns and rows.

For example, one sheet:

  Subject   Modifier    Tech_A   Chart_A  Unknown_B  Tech_B
    A         Tech_A      1        4         1         2
    A         Chart_A     3        3         2         4
    B         Unknown_B   4        2         4         5
    B         Tech_B      5        1         3         3

Another sheet,

  Subject   Modifier    Tech_A  Chart_A  Unknown_A Unknown_B  Tech_B  Chart_B
    A         Tech_A      1        4         1         2        1       1
    A         Chart_A     3        3         2         4        3       2
    A         Unknown_A   1        2         5         3        3       4
    B         Unknown_B   4        2         4         5        5       3
    B         Tech_B      5        1         3         3        2       5
    B         Chart_B     1        1         2         4        2       0

The combined single file will merge all the data and sum the similar ones,

  Subject   Modifier    Tech_A  Chart_A  Unknown_A Unknown_B  Tech_B  Chart_B
    A         Tech_A      2        8         1         3        3       1
    A         Chart_A     6        6         2         6        7       2
    A         Unknown_A   1        2         5         3        3       4
    B         Unknown_B   8        4         4         9        10      3
    B         Tech_B      10       2         3         6        6       5
    B         Chart_B     1        1         2         4        2       0

How can I achieve this in python?

Thanks in advance

1 Answer 1

1

try something like this:

df = pd.concat(pd.read_excel('2018_Sales_Total.xlsx', sheet_name=None), ignore_index=True)
df = df.groupby('Modifier').sum()

According to the documentation, specifing sheet_name to 'None' you get all sheets.

Sign up to request clarification or add additional context in comments.

3 Comments

This will put everything together. The data is huge and this will put everything there. Is there a way to loop through the sheets and sum while adding them?
i've edited my answer. By executing groupby and summarizing the values you should get to what you desire.
This calculated the sum. But for somehow I got duplicate columns and those duplicate has different values. Thank you for help!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.