I have a CSV with 74 fields of mixed types, including dates, and 10 rows that have data in some columns but not in most. The missing cells are simply empty (no nan, NULL, or coded values for missing). I'm trying to load this into an existing table in Oracle. By default, pandas reads empty cells as NaN regardless of the column type. The problem is that Oracle does not accept NaN values; it expects NULL, and as far as I can tell you cannot fill empty pandas cells with NULL. I'm using the cx_Oracle library to connect to the database. I have used it elsewhere for simple inserts of individual variables, but this is the first time I'm loading an entire DataFrame into Oracle.

I have tried using SQLAlchemy for the connection together with pandas' to_sql() to write the DataFrame in a form Oracle will accept, but I ran into database connection issues. Since I can connect with cx_Oracle, that's what I'm pursuing here.

Can I not load a bunch of empty cells into an Oracle table? If I can't then how should I convert empty cells into NULL in a way that pandas will load NULL as NULL into Oracle?

When the below code is run, I get the "ORA-01722: invalid number" error. I understand why I'm getting this, that "NULL" is trying to load into a number field, so there's a mismatch. Question is, what is the proper way to do this, accounting for a diverse data frame?

import pandas as pd
surveyData = pd.read_csv(r"FM_Sample.csv", delimiter=',', index_col = False)
from datetime import datetime

surveyData.fillna("NULL", inplace = True)

def insertDFrecs():
    import cx_Oracle
    connstr = 'URL:port/dbname'
    conn = cx_Oracle.connect('user', 'pass', connstr)
    curs = conn.cursor()
    curs.execute(query)
    conn.commit()
    conn.close()

oracleFieldsList = ['FO_NUM', 'WR_ID' ... 'FILE_NO', 'UMW']
oracle_fields = ",".join(oracleFieldsList)
    
try:
    for i,row in surveyData.iterrows():
        FO_NUM = row['FO_NUM'] #NUMBER
        WR_ID = row['WR_ID'] #NUMBER
        ... = ...
        FILE_NO = row['FILE_NO']
        UMW = row['UMW']

        entryValuesList = [FO_NUM,  WR_ID, PDIV_ID ... str(FILE_NO), str(UMW)]
        entry_values = str(entryValuesList).strip("[]")
        
        sql = "INSERT INTO ORACLE_TABLE (" + oracle_fields + """) VALUES (""" + entry_values + ")"

except Exception as e:
    print(e)

try:
    insertDFrecs(sql)
except Exception as e:
    print(e)

I was expecting the data from the CSV to convert to a pandas df and then load into an Oracle table.

  • See this post: stackoverflow.com/questions/52259473/… It suggests that you convert your NULL strings to Python None type. Commented Jan 26, 2024 at 20:01
  • But where is the fillna() statement that converts to None supposed to go? In the top, it can apply to the dataframe, but once I start declaring local variables further down this breaks from the dataframe mold and the values just go back to NaN because they're vars not dataframe cells. In the "entryValuesList", all those variables reverted to NaN even though I applied the None to empty cells on the dataframe. Commented Jan 26, 2024 at 20:44

1 Answer

I don't see any issue loading empty fields as NULL when using code like load_csv.py, for example:

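import csv

# `connection` is assumed to be an open cx_Oracle connection and FILE_NAME
# the path to the CSV file; neither is defined in this snippet.
cursor = connection.cursor()

# Set cursor sizes to match the table definition or known max data sizes
#   create table test (id number, name varchar2(25));
cursor.setinputsizes(None, 25)

# Adjust the batch size to meet your memory and performance requirements
batch_size = 100

with open(FILE_NAME, 'r') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    sql = "insert into test (id,name) values (:1, :2)"
    data = []
    for line in csv_reader:
        # Empty CSV fields arrive as '' and Oracle stores an empty string as NULL
        data.append((line[0], line[1]))
        # Send a full batch to the database, then start a new one
        if len(data) % batch_size == 0:
            cursor.executemany(sql, data)
            data = []
    # Insert whatever is left over after the last full batch
    if data:
        cursor.executemany(sql, data)
    connection.commit()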

With the data file:

1,Abel
2,Ande
3,Atkinson
4,
5,Baer
,Baida
7,Banda
,
9,Bell

This loads fine. Checking in SQL*Plus:

SQL> set NULL XXXX
SQL> select * from test;

        ID NAME
---------- -------------------------
         1 Abel
         2 Ande
         3 Atkinson
         4 XXXX
         5 Baer
XXXX       Baida
         7 Banda
XXXX       XXXX
         9 Bell
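
The same bind-variable approach should carry over to a pandas DataFrame, as long as missing values reach the driver as Python None rather than NaN or the string "NULL". The following is a minimal sketch under assumptions, not a tested loader for the 74-column table. The helper names dataframe_to_bind_rows, build_insert and insert_dataframe are hypothetical, the DataFrame column names are assumed to match the Oracle column names, and cursor is assumed to be an open cx_Oracle cursor.

```python
import io

import pandas as pd


def dataframe_to_bind_rows(df):
    """Return the DataFrame rows as tuples, with missing values as None."""
    # Casting to object should yield plain Python ints/floats instead of
    # NumPy scalars, and lets None sit next to them in the same column.
    as_objects = df.astype(object)
    return [
        tuple(None if pd.isna(value) else value for value in row)
        for row in as_objects.itertuples(index=False, name=None)
    ]


def build_insert(table, columns):
    """Build an INSERT statement with one positional bind per column."""
    column_list = ", ".join(columns)
    binds = ", ".join(f":{i}" for i in range(1, len(columns) + 1))
    return f"insert into {table} ({column_list}) values ({binds})"


def insert_dataframe(cursor, df, table):
    """Hypothetical helper: insert every row of df through bind variables."""
    # Assumption: df column names are exactly the Oracle column names.
    sql = build_insert(table, list(df.columns))
    cursor.executemany(sql, dataframe_to_bind_rows(df))


# Small in-memory CSV with empty cells, mirroring the data file above
sample = pd.read_csv(io.StringIO("ID,NAME\n1,Abel\n4,\n,Baida\n"))
rows = dataframe_to_bind_rows(sample)
sql = build_insert("test", list(sample.columns))
print(sql)
print(rows)
```

Because the values are passed as binds instead of being pasted into the SQL text, a None should be stored as NULL whatever the column type. If the first rows of a column hold only None, calling cursor.setinputsizes() as in the snippet above may help the driver choose the right bind types.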