0

I'm loading the excel(xlsx) file exported from dashboard in OpenPyxl and some of date values are defaulting into 44774.375 and it should be converted to 01-08-2022 09:00:00 to load into SF table.

Also In dashboard Users are giving 4/21/2023 10:00:00 PM this format but I should convert into 21-04-2023 10:00:00

not sure how to apply this both formatting condition in columns FE RETA, RETA.

I'm currently using the below code,

import pandas as pd
from openpyxl import load_workbook

wb = load_workbook(filename= "file.xlsx", data_only = True)
sheet_names = wb.get_sheet_names()
name = sheet_names[0]
sheet_ranges = wb[name]
df = pd.DataFrame(sheet_ranges.values, index = none)
print(df)

#To replace (4/21/2023 10:00:00 PM) format to (21-04-2023 10:00:00)
df['FE RETA'] = pd.to_datetime(df['FE RETA'], format = '%m/%d/%Y %I:%M:%S %p', errors='coerce')
df['FE RETA'] =df['FE RETA'].dt.strftime('%d-%m-%Y %H:%m:%S')

#To replace all remaining number format to (21-04-2023 10:00:00) 
#but its only working if the entire column is in number format
df['FE RETA'] = pd.to_datetime(df['FE RETA'],unit='d') + dt.datetime(1899,12,30)
print(df)

but however my date columns having also proper datetime format values too. So it throwing me the error.

TypeError: Invalid type for timedelta scalar: <class 'datetime.datetime'>

Source file:

Tht|       FE RETA               |  RETA                | TPM SLACycletime
---------------------------------------------------------------------------
US |    4/21/2023 10:00:00 PM    |4/21/2023 10:30:00 AM | Invalid Data
US |    4/22/2023 11:45:00 PM    |44781.54167           | 558:19:30
US |    4/21/2023 10:30:00 AM    |10-09-2022 18:03:00   | 111:44:26
US |    01-08-2022 10:00:00      |44778.41667           | 15:44:26
US |    44774.375                |44775.52083           | Invalid Data

Desired Output:

Tht|       FE RETA               |  RETA                | TPM SLACycletime
---------------------------------------------------------------------------
US |    21-04-2023 10:00:00      |21-04-2023 10:30:00   | Invalid Data
US |    22-04-2023 11:45:00      |08-08-2022 13:00:00   | 558:19:30
US |    21-04-2023 10:30:00      |10-09-2022 18:03:00   | 111:44:26
US |    01-08-2022 10:00:00      |05-08-2022 10:00:00   | 15:44:26
US |    01-08-2022 09:00:00      |02-08-2022 12:30:00   | Invalid Data

User is not agreeing to correct it while uploading, he's expecting to clean the date format in automation itself. Please help on this.

Thanks

0

1 Answer 1

1

When converting excel datetimes there is actually a different function: pd.to_timedelta(). To deal with which values should receive which function, I would create a mask and then invert it (or create another mask if there are more than 2 formats).

def func(col):


    #Check for the numerical excel date format
    mask = pd.to_numeric(df[col], errors='coerce').notna()
    
    #Convert excel dates to datetime
    df.loc[mask, col] = pd.to_timedelta(df.loc[mask, col], unit='d') + pd.Timestamp('1899-12-30')
    
    #Convert the string date format to datetime
    df.loc[~mask, col] = pd.to_datetime(df.loc[~mask, col], errors='coerce', format='%m/%d/%Y %I:%M:%S %p')
    #Convert to your desired format
    df[col] =df[col].dt.strftime('%d/%m/%Y %I:%M:%S %p')

Hopefully that works for you

Edit: To help with editing the date formats, use this documentation and change as you like

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks Alex. It's working great. However the time value is not coming as precisely as input file. Like 4/21/2023 10:30:00 AM instead it's coming as 21-04-2023 10:04:00.
Also Alex, could you please make it as function I need to use it for other columns too. So I can directly call the function.
@DevarajManiMaran Try that now. To add the PM I believe it's %p in the formatting. As for the time offset, I am not sure what is causing that. Likely weird formatting in the data or pd.to_datetime not behaving as we'd expect. I suggest you debug the code and see which dates are the ones with imprecision errors, and where that may come about.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.