35

I have some complicated formating saved in a template file into which I need to save data from a pandas dataframe. Problem is when I use pd.to_excel to save to this worksheet, pandas overwrites the formatting. Is there a way to somehow 'paste values' form the df into the worksheet? I am using pandas 0.17

import openpyxl
import pandas as pd
wb= openpyxl.load_workbook('H:/template.xlsx')
sheet = wb.get_sheet_by_name('spam')
sheet.title = 'df data'
wb.save('H:/df_out.xlsx')

xlr = pd.ExcelWriter('df_out.xlsx')
df.to_excel(xlr, 'df data')
xlr.save()

5 Answers 5

44

openpyxl 2.4 comes with a utility for converting Pandas Dataframes into something that openpyxl can work with directly. Code would look a bit like this:

from openpyxl.utils.dataframe import dataframe_to_rows
rows = dataframe_to_rows(df)

for r_idx, row in enumerate(rows, 1):
    for c_idx, value in enumerate(row, 1):
         ws.cell(row=r_idx, column=c_idx, value=value)

You can adjust the start of the enumeration to place the cells where you need them.

See openpyxl documentation for more information.

Sign up to request clarification or add additional context in comments.

5 Comments

May be we shall have a function like DataFrame to sheet?
@Abbas I don't think that is necessary at all. Once 2.4 is released I will work with Pandas to make use of this in the df.to_excel() method.
Will this allow me to choose a different row from df to an arbitrary row in the file and repeat it for all rows ?
This is the long lost answer to how to overwrite the data of an existing sheet using pandas and openpyxl! I added: rows = dataframe_to_rows(df, index=False, header=True)
@CharlieClark I don't think that is necessary at all. Once 2.4 is released I will work with Pandas to make use of this in the df.to_excel() method. => Any progress on the support of openpyxl.Workbook in df_to_excel?
13

I slightly modified @CharlieClark's great answer to avoid the index (which is not there in the original Excel file). Here is a ready-to-run code:

import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl import load_workbook
wb = load_workbook('test.xlsx')  # load as openpyxl workbook; useful to keep the original layout
                                 # which is discarded in the following dataframe
df = pd.read_excel('test.xlsx')  # load as dataframe (modifications will be easier with pandas API!)
ws = wb.active
df.iloc[1, 1] = 'hello world'    # modify a few things
rows = dataframe_to_rows(df, index=False)
for r_idx, row in enumerate(rows, 1):
    for c_idx, value in enumerate(row, 1):
        ws.cell(row=r_idx, column=c_idx, value=value)
wb.save('test2.xlsx')

6 Comments

dataframe_to_rows(df, index=False, header=True) if you've headers
A bit late to the party but this code snippet is not working unfortunately. Workbook object neither has an attribute 'active' nor 'cell'. It only has the attribute 'active_cell' which is not a built-in method, hence not callable. Kindly correct me if I'm wrong or suggest a workaround
@SoumyaC This code worked as of March 2021, I confirm I used it everal times. Maybe things have changed with the new versions of Pandas/OpenPyxl?
@Basj Please let me know if there has been some changes, if convenient. Also does this solution work for macros? Because my use-case involves editing the underlying macros and writing them back to the Excel template
@Basj My apologies for the confusion. The API is still the same, I had already sliced it by sheetname and tried to access the cell() method. Had to do merging and unmerging the cells to achieve my desired output
|
2

I extended and Charlie's answer and extracted it into a function which imitates the signature of DataFrame.to_excel:

from openpyxl.utils.dataframe import dataframe_to_rows

def df_to_excel(df, ws, header=True, index=True, startrow=0, startcol=0):
    """Write DataFrame df to openpyxl worksheet ws"""

    rows = dataframe_to_rows(df, header=header, index=index)

    for r_idx, row in enumerate(rows, startrow + 1):
        for c_idx, value in enumerate(row, startcol + 1):
             ws.cell(row=r_idx, column=c_idx).value = value

Example use, note that openpyxl puts the index name on a second line below the actual index, which is different behavior compared to DataFrame.to_excel:

import pandas as pd
import openpyxl
import os

wb = openpyxl.Workbook()
df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"]).rename_axis("Index")
df_to_excel(df, wb.active)
wb.save("out.xlsx")
os.startfile("out.xlsx")  # open the file in Excel (only works on Windows)

Comments

1

Here is the solution for you using clipboard:

import openpyxl
import pandas as pd
import clipboard as clp

#Copy dataframe to clipboard
df.to_clipboard()
#paste the clipboard to a valirable
cells = clp.paste()
#split text in varialble as rows and columns
cells = [x.split() for x in cells.split('\n')]

#Open the work book
wb= openpyxl.load_workbook('H:/template.xlsx')
#Get the Sheet
sheet = wb.get_sheet_by_name('spam')
sheet.title = 'df data'
#Paste clipboard values to the sheet
for i, r in zip(range(1,len(cells)), cells):
    for j, c in zip(range(1,len(r)), r):
        sheet.cell(row = i, column = j).value = c
#Save the workbook
wb.save('H:/df_out.xlsx')

3 Comments

This creates two intermediate data structures: the clipboard and cells.
I was looking for something like paste clipboard in openpyxl, similar to the functions in pandas.
There will be the ws.values property which we will be an easy way to get at the values of a worksheet but this will not be writable. ws.iter_cols() will provide an columnar interface for editable worksheets.
0

you shoud to get your data shap first to determine the range of loop

wb_formats=load_workbook("template.xlsx")            
ws_index=wb_formats.get_sheet_by_name("index")
daily_input= pd.read_excel(self.readfile,"input")
list_item=data_analysis1.groupby(["item_id"])["product_name"].unique()
list_item_size=pd.DataFrame(list_item,columns=["product_name"]).shape[0]

#create  the index sheet:
            r = 2  # start at 4th row
            c = 1 # column 'a'
            for row in range(0,list_item_size):  
                rows = list_item.iloc[row]
                for item in rows:
                    ws_index.cell(row=r, column=c).value = item
                    c += 1 # Column 'd'
                c = 1
                r += 1
wb_formats.save(save_name)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.