I have a CSV with 74 fields of mixed types, including dates, and 10 rows that have data in some columns but not most. The cells without data are empty (no nan, NULL, or coded values for missing). I'm trying to push this into an established table in Oracle. By default, empty cells become NaN when read into a dataframe, whatever the field's type. The issue is that Oracle doesn't accept NaN values; it accepts NULL. You cannot fill pandas empties with NULL, so that seems to be the problem. I'm using the cx_Oracle library to create a connection to the db and have used it in other places with simple variable substitution, but this is my first time loading an entire df into Oracle.
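As a minimal illustration of the NaN behaviour described above (a sketch using a made-up three-column sample, not the real 74-column file), reading a CSV with empty cells shows them coming back as NaN, and an integer-looking column that contains a blank is read as float64:

```python
import io
import pandas as pd

# Hypothetical sample standing in for FM_Sample.csv; only the column names come from the question.
csv_text = "FO_NUM,WR_ID,UMW\n1,,abc\n2,7,\n"
sample = pd.read_csv(io.StringIO(csv_text))

# Empty cells are reported as missing (NaN), in numeric and text columns alike.
missing_mask = sample.isna()
print(missing_mask)

# WR_ID holds only whole numbers, but the blank forces a float64 column.
print(sample.dtypes)
```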
I have tried using sqlalchemy to make the connection and pandas' to_sql() function to convert the df into something Oracle will accept, but I'm running into db connection issues. Since I can connect with cx_Oracle, that's what I'm pursuing here.
Can I not load a bunch of empty cells into an Oracle table? If I can't, how should I convert the empty cells so that they load into Oracle as NULL?
When the code below is run, I get the "ORA-01722: invalid number" error. I understand why I'm getting this: the string "NULL" is being loaded into a number field, so there's a type mismatch. The question is, what is the proper way to do this for a data frame with diverse column types?
import pandas as pd
from datetime import datetime
import cx_Oracle

# Read the CSV; empty cells become NaN
surveyData = pd.read_csv(r"FM_Sample.csv", delimiter=',', index_col=False)
# Replace every NaN with the string "NULL"
surveyData.fillna("NULL", inplace=True)

def insertDFrecs(query):
    # Open a connection, run one statement, commit, and close
    connstr = 'URL:port/dbname'
    conn = cx_Oracle.connect('user', 'pass', connstr)
    curs = conn.cursor()
    curs.execute(query)
    conn.commit()
    conn.close()

oracleFieldsList = ['FO_NUM', 'WR_ID' ... 'FILE_NO', 'UMW']
oracle_fields = ",".join(oracleFieldsList)

try:
    for i, row in surveyData.iterrows():
        FO_NUM = row['FO_NUM']  # NUMBER
        WR_ID = row['WR_ID']  # NUMBER
        ... = ...
        FILE_NO = row['FILE_NO']
        UMW = row['UMW']
        entryValuesList = [FO_NUM, WR_ID, PDIV_ID ... str(FILE_NO), str(UMW)]
        # Build the VALUES list as text by stripping the list brackets
        entry_values = str(entryValuesList).strip("[]")
        sql = "INSERT INTO ORACLE_TABLE (" + oracle_fields + ") VALUES (" + entry_values + ")"
except Exception as e:
    print(e)

try:
    insertDFrecs(sql)
except Exception as e:
    print(e)
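To make the cause of ORA-01722 concrete, here is a small sketch of what the string-building step produces. The two-column row and its values are made up; the point is that after fillna("NULL") the missing value is a Python string, so str() on the list wraps it in quotes and Oracle receives the text 'NULL' rather than the NULL keyword.

```python
# Hypothetical row after fillna("NULL"): the first NUMBER column was empty in the CSV.
entry_values_list = ["NULL", 12345.0]

# Same technique as in the question: stringify the list and strip the brackets.
entry_values = str(entry_values_list).strip("[]")
sql = "INSERT INTO ORACLE_TABLE (FO_NUM,WR_ID) VALUES (" + entry_values + ")"

print(sql)
# The statement contains 'NULL' in quotes, i.e. a string literal aimed at a NUMBER column.
```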
I was expecting the data from the CSV to convert to a pandas df and then load into an Oracle table.
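One common way to get real NULLs, sketched here under stated assumptions, is to convert NaN to Python None and pass the values as bind parameters instead of building the SQL string by hand. The standard-library sqlite3 module stands in for Oracle so the example runs anywhere; the table `demo`, the helper `nan_to_none`, and the sample rows are hypothetical. With cx_Oracle the analogous call would be `cursor.executemany` with positional binds such as `:1, :2, :3`, where None is sent as NULL.

```python
import math
import sqlite3

def nan_to_none(value):
    """Return None for a float NaN, otherwise the value unchanged (hypothetical helper)."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value

# Hypothetical rows as they might come out of a dataframe: NaN marks an empty cell.
raw_rows = [(1, float("nan"), "abc"), (2, 7.0, float("nan"))]
clean_rows = [tuple(nan_to_none(v) for v in row) for row in raw_rows]

conn = sqlite3.connect(":memory:")  # stand-in for the Oracle connection
conn.execute("CREATE TABLE demo (FO_NUM INTEGER, WR_ID INTEGER, UMW TEXT)")

# Bind parameters: the driver maps None to NULL, so no quoting is involved.
# cx_Oracle analogue (assumption): cursor.executemany("INSERT INTO ... VALUES (:1, :2, :3)", clean_rows)
conn.executemany("INSERT INTO demo VALUES (?, ?, ?)", clean_rows)
conn.commit()

null_wr_ids = conn.execute("SELECT COUNT(*) FROM demo WHERE WR_ID IS NULL").fetchone()[0]
null_umws = conn.execute("SELECT COUNT(*) FROM demo WHERE UMW IS NULL").fetchone()[0]
print(null_wr_ids, null_umws)
conn.close()
```

With the real dataframe, the rows could come from `surveyData.itertuples(index=False, name=None)`, skipping the `fillna("NULL")` step. Missing dates (NaT) are not handled by this helper and would need a check such as `pd.isna` instead.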