How to write a Python script that reads two excel files and outputs distinct values

Question

I'm new to programming and I'd like to know how to approach this problem:

A Python script that reads two Excel files:

Excel1.xlsx (has only 1 column)
Excel2.xlsx (has only 1 column)

The script would then read the names in each Excel file and create a NEW EXCEL FILE with the names from Excel1.xlsx that ARE NOT IN Excel2.xlsx.

Example:

Excel1 has {"Bob", "Bill", "Joe", "Sam", "Frank"}
Excel2 has {"Bob", "Joe", "Sam", "Frank"}

Expected output would be:

NewExcelFile {"Bill"}

Since I'm new, I know how to read the files, but I don't know where to go from here:

import pandas as pd

Excel1 = pd.read_excel("Excel1.xlsx")
Excel2 = pd.read_excel("Excel2.xlsx")

You could convert the data to sets (e.g. set1 = set(Excel1), set2 = set(Excel2)) and use the set difference to get the items from Excel1 that are not in Excel2: unique = set1 - set2.
– B Remmelzwaal, Commented Mar 1, 2023 at 15:39
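
The set-difference idea from this comment can be sketched on plain Python lists. One caveat: calling set() directly on a DataFrame iterates its column labels rather than its cell values, so the values have to be pulled out first (for example with Excel1.iloc[:, 0].tolist()). The helper name names_only_in_first is hypothetical, and the sample lists are taken from the question's example.

```python
def names_only_in_first(first, second):
    """Return the items of `first` that do not appear in `second`, keeping order."""
    seen_in_second = set(second)  # set membership tests are O(1) on average
    return [name for name in first if name not in seen_in_second]


# Sample data from the question's example.
excel1_names = ["Bob", "Bill", "Joe", "Sam", "Frank"]
excel2_names = ["Bob", "Joe", "Sam", "Frank"]

# Plain set difference: order is not guaranteed and duplicates collapse.
unique = set(excel1_names) - set(excel2_names)
print(unique)  # {'Bill'}

# Order-preserving variant that also keeps duplicates from the first list.
print(names_only_in_first(excel1_names, excel2_names))  # ['Bill']
```

The plain set difference is the shortest form; the list-based helper may be preferable when the original row order matters.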

Hedi Mineoui · Accepted Answer · 2023-03-02 12:23:33Z

1

You can try the code below; it uses a library called DeepDiff to compute the difference:

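import pandas as pd
from deepdiff import DeepDiff

# header=None treats the first row as data instead of a column name
Excel1 = pd.read_excel("Excel1.xlsx", header=None)
Excel2 = pd.read_excel("Excel2.xlsx", header=None)

# Convert each DataFrame to a list of rows (each row is a one-item list)
l1 = Excel1.values.tolist()
l2 = Excel2.values.tolist()

# By default DeepDiff compares the lists position by position;
# pass ignore_order=True to compare them as unordered collections
print(DeepDiff(l1, l2))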

After that, you can create an Excel file and write the result of the function to it.
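
As one possible way to finish that last step, here is a minimal sketch that uses pandas alone instead of DeepDiff, filtering with Series.isin and writing the result with DataFrame.to_excel. The function names rows_missing_from and write_missing_names are hypothetical, and it assumes the names sit in the first column of each workbook and that an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def rows_missing_from(first: pd.DataFrame, second: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `first` whose first-column value is absent from `second`."""
    first_col = first.iloc[:, 0]
    second_col = second.iloc[:, 0]
    # isin() builds a boolean mask; ~ inverts it to keep only the missing values.
    return first[~first_col.isin(second_col)].reset_index(drop=True)


def write_missing_names(path1: str, path2: str, out_path: str) -> None:
    """Read two single-column workbooks and write the difference to `out_path`."""
    # header=None treats the first row as data rather than as a column name.
    first = pd.read_excel(path1, header=None)
    second = pd.read_excel(path2, header=None)
    # header=False and index=False keep the output to a single column of names.
    rows_missing_from(first, second).to_excel(out_path, header=False, index=False)
```

With the files from the question, a call might look like write_missing_names("Excel1.xlsx", "Excel2.xlsx", "NewExcelFile.xlsx"), where the output filename is an assumption based on the question's wording.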

edited Mar 2, 2023 at 12:23

answered Mar 1, 2023 at 17:45


1 Comment

Azhirius Over a year ago

Thanks for the help! I've changed just a bit in some parts following your logic and it worked flawlessly!
