How to change column format in excel with python

Question

I want to copy an specific rows and columns one sheet to another with "openpyxl" method. But my main excel file is .xlsb file and "openpyxl" is not supporting .xlsb file.So I build that complicated way. (* I can not change the .xlsb from Microsoft Excel according to company rules).

main document.xlsb file->temporary document.xlsx->my analyse document.xlsx

-Firstly, I' m changing the data format .xlsb to .xlsx with pandas.

-After that, from temporary document.xlsx, I' m taking specific columns and rows with openpyxl method and paste to my analyse document.xlsx

-My question is: I want to change D column format from "general" to "short date" and I am beginner at Python. Would you please help me about codes? Also if I can change the format cell in ".xlsb to .xlsx tranformation period" maybe I can take input from user: "which date do you want to append 'my analyse document.xlsx?'"

'main document.xlsx'

'temporary document.xlsx'

'my analyse document.xlsx'

import pandas as pd
import openpyxl

df = pd.read_excel("main document.xlsb",sheet_name="Data", engine="pyxlsb")
df.to_excel("temporary document.xlsx")

#! Python 3
# - Copy and Paste Ranges using OpenPyXl library

# Prepare the spreadsheets to copy from and paste too.

# File to be copied
wb = openpyxl.load_workbook("temporary document.xlsx")  # Add file name
sheet = wb["Sheet1"]  # Add Sheet name

# File to be pasted into
template = openpyxl.load_workbook("my analyse document.xlsx")  # Add file name
temp_sheet = template["Sheet2"]  # Add Sheet name


# Copy range of cells as a nested list
# Takes: start cell, end cell, and sheet you want to copy from.
def copyRange(startCol, startRow, endCol, endRow, sheet):
    rangeSelected = []
    # Loops through selected Rows
    for i in range(startRow, endRow + 1, 1):
        # Appends the row to a RowSelected list
        rowSelected = []
        for j in range(startCol, endCol + 1, 1):
            rowSelected.append(sheet.cell(row=i, column=j).value)
        # Adds the RowSelected List and nests inside the rangeSelected
        rangeSelected.append(rowSelected)

    return rangeSelected


# Paste range
# Paste data from copyRange into template sheet
def pasteRange(startCol, startRow, endCol, endRow, sheetReceiving, copiedData):
    countRow = 0
    for i in range(startRow, endRow + 1, 1):
        countCol = 0
        for j in range(startCol, endCol + 1, 1):
            sheetReceiving.cell(row=i, column=j).value = copiedData[countRow][countCol]
            countCol += 1
        countRow += 1


def createData():
    print("Processing...")
    selectedRange = copyRange(2, 2011, 183, 2274, sheet)  # Change the 4 number values (startCol, startRow, endCol, endRow, sheet)
    pastingRange = pasteRange(2, 4573, 182, 4836, temp_sheet, selectedRange)  # Change the 4 number values (startCol, startRow, endCol, endRow, sheet)
    # You can save the template as another file to create a new file here too.s
    template.save("my analyse document.xlsx")
    print("Range copied and pasted!")

go= createData()

I tried zypro and manojk's solutions. I will edit the question specificly. — piseynir
– piseynir, Commented Mar 30, 2020 at 14:55

zypro · Accepted Answer · 2020-03-30 15:18:04Z

3

Yeah have a look here in the docs:

import xlsxwriter

workbook = xlsxwriter.Workbook('your_file.xlsx')
worksheet = workbook.add_worksheet()

cell_format05 = workbook.add_format()
cell_format05.set_num_format('mm/dd/yy')
worksheet.write(5, 0, 36892.521, cell_format05)  # output -> 01/01/01

# untested code for you, get your cells into column_D8_downwards array
# this is rather a structural code, not functional!
row = 0
for cell in column_D8_downwards:
    worksheet.write(row, 'D8', cell, cell_format_05)
    row=+1

workbook.close()

So iterate over all cells in your column (D8 downwards) and write the old value into the cell with the new format.

Here is a good example.

edited Mar 30, 2020 at 15:18

answered Mar 30, 2020 at 14:03

zypro

1,1833 gold badges13 silver badges34 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

piseynir Over a year ago

Thanks for reply. I am new at python, according to that, I can not apply the codes very well. The codes which you mention is not worked. This error occured:" NameError: name 'cell_format05' is not defined"

zypro Over a year ago

yeah, it was just copied out of the examples in the docs. You get a format if you add one to the workbook.

ManojK · Accepted Answer · 2020-03-30 16:31:12Z

0

You can also do the same after reading the data with pandas.DataFrame with help of xlrd:

import xlrd
import pandas as pd

df = pd.read_csv('Your_File.csv')
df = df[6:].reset_index()
df.columns = df.iloc[0]

df['Date'] = df['Date_Int'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0))

print(df)

   TARİH       Date
0     43891 2020-03-01
1     43892 2020-03-02
2     43893 2020-03-03
3     43894 2020-03-04
4     43895 2020-03-05

Further you can change the date format as per your requirement.

edited Mar 30, 2020 at 16:31

answered Mar 30, 2020 at 14:21

ManojK

1,6403 gold badges11 silver badges17 bronze badges

11 Comments

piseynir Over a year ago

Thanks, but this error occured; "KeyError: 'Date_Int" with your solution

ManojK Over a year ago

Change this df['Date'] = df['TARIH'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0)) because Date_Int is not the name of column in your file. It is a column which I added in my example data.

piseynir Over a year ago

File "pandas_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc File "pandas_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: 'TARİH' this error occured. 'TARİH' is not column name, maybe it is related with that.

piseynir Over a year ago

import pandas as pd import xlrd  df = pd.read_excel("read file.xlsb",sheet_name="Data", engine="pyxlsb") df.to_excel("2nd file.xlsx")  df['Date'] = df['TARİH'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0))

I' m doing like this, because after that i will use "openpyxl" and openpyxl doesn't work with .xlsb file. I will take specific rows and columns from "2nd file.xlsx" Here is my specific rows,columns copy method: yagisanatode.com/2017/11/18/…

ManojK Over a year ago

Looks like you are trying to read the excel file with some blank rows at the top as shown in the picture, please delete rows from 1 to 6 in excel file and read again, ensure the row no. 7 is the header in your pandas dataframe. Then run the code df['Date'] = df['TARİH'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0))

|

piseynir · Accepted Answer · 2020-03-31 18:09:49Z

0

Problem solved in very simple way, I shocked when I solved it.

1-Open "my analyse document.xlsx"

2-Select the whole "D" column

3-Click home button

4-Format cell-> Short Date

Even I update the excel data with python, column format type is not changing.

Thanks for all supports.

answered Mar 31, 2020 at 18:09

piseynir

2251 gold badge5 silver badges18 bronze badges

Collectives™ on Stack Overflow

How to change column format in excel with python

3 Answers 3

2 Comments

11 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

2 Comments

11 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related