I'm new to programming and I'd like to know how to approach the following problem:

A Python script that reads two Excel files:

  • Excel1.xlsx (has only 1 column)
  • Excel2.xlsx (has only 1 column)

The script would then read the names from each Excel file and create a NEW EXCEL FILE containing the names from Excel1.xlsx that ARE NOT IN Excel2.xlsx.

Example:

Excel1 has {"Bob, Bill, Joe, Sam, Frank"}
Excel2 has {"Bob, Joe, Sam, Frank"}

Expected output would be:

NewExcelFile {"Bill"}

Since I'm new, I know how to read the files, but I don't know where to go from here:

import pandas as pd

# File names must be passed as strings
Excel1 = pd.read_excel("Excel1.xlsx")
Excel2 = pd.read_excel("Excel2.xlsx")
  • You could convert the data to sets (e.g. set1 = set(Excel1.iloc[:, 0]), set2 = set(Excel2.iloc[:, 0])) and use the set difference to get the items from Excel1 that are not in Excel2: unique = set1 - set2. Commented Mar 1, 2023 at 15:39
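The set-difference idea from this comment can be sketched as follows. This is a minimal illustration, not a definitive solution: the helper name `names_only_in_first` and the in-memory DataFrames standing in for the two workbooks are assumptions made for the example. Note that calling `set()` directly on a DataFrame iterates over its column labels, so the sketch selects the first column explicitly.

```python
import pandas as pd


def names_only_in_first(first: pd.DataFrame, second: pd.DataFrame) -> set:
    """Return the values in the first column of `first` that are absent from `second`.

    Hypothetical helper for illustration; it assumes each frame has one column.
    """
    # iloc[:, 0] selects the single column regardless of its header name.
    set1 = set(first.iloc[:, 0].dropna())
    set2 = set(second.iloc[:, 0].dropna())
    return set1 - set2


# In-memory stand-ins for Excel1.xlsx and Excel2.xlsx from the question.
excel1 = pd.DataFrame({"name": ["Bob", "Bill", "Joe", "Sam", "Frank"]})
excel2 = pd.DataFrame({"name": ["Bob", "Joe", "Sam", "Frank"]})

unique = names_only_in_first(excel1, excel2)
print(unique)  # → {'Bill'}
```

A set does not preserve the original row order and collapses duplicates, which is probably fine for a list of names but worth keeping in mind.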

1 Answer

You can try the code below; it uses a library called DeepDiff to compute the difference:

import pandas as pd
from deepdiff import DeepDiff

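# header=None treats the first row as data rather than as a column name
Excel1 = pd.read_excel("Excel1.xlsx", header=None)
Excel2 = pd.read_excel("Excel2.xlsx", header=None)

# Convert each DataFrame to a list of rows
l1 = Excel1.values.tolist()
l2 = Excel2.values.tolist()

# Print a report of the differences between the two lists
print(DeepDiff(l1, l2))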

After that, you can create a new Excel file and write the result of the function to it.
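One possible way to handle that last step is sketched below, using plain pandas instead of DeepDiff (a swapped-in technique, not the method from this answer). The function names `rows_missing_from` and `write_missing` and the output file name `NewExcelFile.xlsx` are hypothetical. The sketch assumes single-column files without a header row, and reading or writing `.xlsx` files with pandas typically requires the `openpyxl` package to be installed.

```python
import pandas as pd


def rows_missing_from(first: pd.DataFrame, second: pd.DataFrame) -> pd.DataFrame:
    """Keep the rows of `first` whose first-column value does not occur in `second`."""
    col1 = first.iloc[:, 0]
    col2 = second.iloc[:, 0]
    # isin() builds a boolean mask; ~ inverts it so only the missing names remain.
    return first[~col1.isin(col2)].reset_index(drop=True)


def write_missing(path1: str, path2: str, out_path: str) -> None:
    """Read both workbooks and write the names found only in the first to a new file."""
    # header=None treats the first row as data, matching the answer's code.
    first = pd.read_excel(path1, header=None)
    second = pd.read_excel(path2, header=None)
    result = rows_missing_from(first, second)
    # index=False and header=False write only the names themselves.
    result.to_excel(out_path, index=False, header=False)
```

Calling `write_missing("Excel1.xlsx", "Excel2.xlsx", "NewExcelFile.xlsx")` would then produce the new file. Unlike a set difference, the boolean mask keeps the names in their original order.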

1 Comment

Thanks for the help! I've changed just a bit in some parts following your logic and it worked flawlessly!
