1

How do i make my df.to_excel function write to an output path? After my script runs, I do not see the files in the output_path directory i have defined.

import pandas as pd
from openpyxl import load_workbook
import os
import datetime

output_path = 'C:/Users/g/Desktop/autotranscribe/python/Processed'

path = 'C:/Users/g/Desktop/autotranscribe/python/Matching'

cols_to_drop =  ['PSI ID','PSIvet Region','PSIvet region num','Fax','County']               
 
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State'} 

for file in os.listdir("C:/Users/g/Desktop/autotranscribe/python/Matching"):
    if file.startswith("PSI") and "(updated headers)" not in file:
        dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if ws.startswith("Cancelled Members"): df = df.drop('Active date', axis=1)
            if any(ws.startswith(x) for x in ["New Members","PVCC"]):
                continue
            #if ws in ["New Members 03.22","PVCC"]:   #sheetstoavoid
            temp = df
            dt = pd.to_datetime(os.path.getctime(os.path.join(path,file)),unit="s").replace(nanosecond=0)
            output[ws] = temp
        writer = pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx')
        for ws, df in output.items():
            df.to_excel(writer, index=None, sheet_name=ws)
        writer.save()
        writer.close()

I tried df.to_excel(writer,output_path, index=None, sheet_name=ws)

But i get an error File "", line 36, in df.to_excel(writer,output_path, index=None, sheet_name=ws)

TypeError: to_excel() got multiple values for argument 'sheet_name'.

1 Answer 1

0

A few comments:

  • The function os.listdir() only returns "unqualified" file names, so before using file, we need to prepend path using something like input_file_name = f'{path}/{file}'.
  • Similarly, pd.ExcelWriter() will need a qualified file name (that is, including the path as well as the "unqualified" file name), which we can get by doing this: output_file_name = f'{output_path}/{file.replace(".xlsx","")} (updated headers).xlsx'.
  • There are some elements of the code in your question that may not be getting used, but rather than comment on or change those, I provide a working version with minimal changes below.

I created directories named Matching and Processed. I placed a file named PSI 123.xlsx in Matching with a tab named Cancelled Members containing the following:

will skip           
will skip           
will skip           
will skip           
will skip           
Col1        Col2    Col3    Active date
xx          NY      110     789

I then ran the following modification to your code (note the changes to output_path and path for testing purposes in my environment):

import pandas as pd
from openpyxl import load_workbook
import os
import datetime

#output_path = 'C:/Users/g/Desktop/autotranscribe/python/Processed'

#path = 'C:/Users/g/Desktop/autotranscribe/python/Matching'

output_path = './Processed'
path = './Matching'

cols_to_drop =  ['PSI ID','PSIvet Region','PSIvet region num','Fax','County']               
 
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State'} 

for file in os.listdir(path):
    if file.startswith("PSI") and "(updated headers)" not in file:
        input_file_name = f'{path}/{file}'
        dfs = pd.read_excel(input_file_name, sheet_name=None,skiprows=5)
        output = dict()
        for ws, df in dfs.items():
            if ws.startswith("Cancelled Members") and 'Active date' in df.columns: df = df.drop('Active date', axis=1)
            if any(ws.startswith(x) for x in ["New Members","PVCC"]):
                continue
            #if ws in ["New Members 03.22","PVCC"]:   #sheetstoavoid
            temp = df
            dt = pd.to_datetime(os.path.getctime(os.path.join(path,file)),unit="s").replace(nanosecond=0)
            output[ws] = temp
        output_file_name = f'{output_path}/{file.replace(".xlsx","")} (updated headers).xlsx'
        writer = pd.ExcelWriter(output_file_name)
        for ws, df in output.items():
            df.to_excel(writer, index=None, sheet_name=ws)
        writer.save()
        writer.close()

After running, the code had created a new file in Processed named PSI 123 (updated headers).xlsx with sheets named as in the input. The sheet Cancelled Members contained the following:

Address State   Zip Status  Status.1    Date    Partner
Col1    Col2    Col3
xx      NY      110
Sign up to request clarification or add additional context in comments.

2 Comments

Works Perfectly. I added one change from our previous thread - output_file_name = f'{output_path}/{file[:3]}_Processed(updated headers).xlsx'
constantstranger sorry for taggin you again. not sure why no one has answered this question of mine today. stackoverflow.com/questions/72191076/… Please let me know if this is something you have insight in

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.