
I'm trying to load a 1 GB CSV file (about 1 million rows) into an Oracle table. The code below works:

import cx_Oracle
import csv

connection1 = cx_Oracle.connect("username", "password", "ORADB")
cur1 = connection1.cursor()
with open("Test.csv", "r") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    next(csv_reader)  # skip the header row
    sql1 = """insert into TEST_CSV_FILE (START_TS, END_TS, FLDR, TSK, INST, SRC_ERR, TGT_ERR)
              values (:1, :2, :3, :4, :5, :6, :7)"""
    i = 0
    for line in csv_reader:
        cur1.execute(sql1, line)  # one round trip per row
        i = i + 1
connection1.commit()
cur1.close()
connection1.close()

However, performance has been horrible: the load does not complete even after 2 hours. Is there an efficient way to load this file quickly? I went through the documentation and tried "executemany" instead of "execute", but I get the error below:

    cur1.executemany(sql1,lines1)
cx_Oracle.DatabaseError: ORA-01036: illegal variable name/number
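For reference, the error comes from the shape of the data rather than the SQL: executemany() expects a sequence of rows, but lines1 is a single row. A minimal sketch (sample values made up for illustration, no database needed):

```python
# A single parsed CSV line is one row: a flat list of 7 strings.
row = ["2021-01-01", "2021-01-02", "fldr", "tsk", "inst", "0", "0"]

# executemany() expects a sequence of rows -- one tuple per INSERT:
rows = [tuple(row)]          # correct shape for executemany()

# Passing `row` itself makes the driver treat each string as a
# separate one-column row, so the seven bind placeholders
# (:1 .. :7) cannot all be matched -- hence ORA-01036.
```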

Any help would be greatly appreciated.

After help from Christopher and abhlb I updated the code as below to batch the rows and use executemany. However, performance is still bad, with the job taking 2 hours at a batch size of 3000. Are there any enhancements I can make to improve the performance?

import cx_Oracle
import csv
from datetime import datetime

start = datetime.now()
connection1 = cx_Oracle.connect("username", "password", "oradb")
cur1 = connection1.cursor()
with open("test_file.csv", "r") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    batch_size = 3000
    sql1 = """insert into TEST_CSV_FILE (START_TS, END_TS, FLDR, TSK, INST, SRC_ERR, TGT_ERR)
              values (:1, :2, :3, :4, :5, :6, :7)"""
    data = []
    for line in csv_reader:
        data.append((line[0], line[1], line[2], line[3], line[4], line[5], line[6]))
        if len(data) == batch_size:
            cur1.executemany(sql1, data)
            data = []
    if data:  # flush the final partial batch
        cur1.executemany(sql1, data)
connection1.commit()
cur1.close()
connection1.close()
print(datetime.now()-start)
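One enhancement worth trying (a sketch only, not verified against this table): declare the maximum width of each bound column once with cursor.setinputsizes(), so cx_Oracle does not have to grow its bind buffers whenever a later batch contains a longer string than any seen before. The sample rows and widths below are made up for illustration:

```python
# Hypothetical sample rows matching the 7 columns of TEST_CSV_FILE.
sample_rows = [
    ("2021-01-01 00:00:00", "2021-01-01 00:05:00", "fldr1", "tsk1", "inst1", "0", "0"),
    ("2021-01-01 00:05:00", "2021-01-01 00:09:00", "folder_long", "task", "i", "12", "3"),
]

# Widest string seen in each of the 7 columns.
max_widths = [max(len(row[i]) for row in sample_rows) for i in range(7)]

# With a live connection this would be declared once, before the loop:
#   cur1.setinputsizes(*max_widths)   # all columns bound as strings
#   cur1.executemany(sql1, data)
```

In practice the widths would come from the table's column definitions (or a first pass over the file) rather than a hand-picked sample.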

1 Answer


You need to convert each row into a tuple of values and pass the whole list of rows to executemany():

dataToInsert = [tuple(x) for x in csv_reader]
cur1.executemany(sql1, dataToInsert)
connection1.commit()
cur1.close()
connection1.close()
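Note that the list comprehension above materialises every row in memory at once, which can be costly for a 1 GB file. A chunked variant (a sketch; the helper name and batch size of 5000 are my own) keeps memory bounded while still calling executemany() on large batches:

```python
import csv
import io
from itertools import islice

def chunked(reader, size):
    """Yield lists of up to `size` tuples from a csv.reader,
    so the whole file never has to fit in memory at once."""
    while True:
        chunk = [tuple(row) for row in islice(reader, size)]
        if not chunk:
            return
        yield chunk

# Demo on a small in-memory CSV; with Oracle this becomes:
#   for chunk in chunked(csv_reader, 5000):
#       cur1.executemany(sql1, chunk)
demo = io.StringIO("a,1\nb,2\nc,3\nd,4\ne,5\n")
sizes = [len(c) for c in chunked(csv.reader(demo), 2)]
```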

3 Comments

@okkadu You will probably want to batch the records up and call executemany() several times. See the example on loading CSV files in the cx_Oracle documentation
@ChristopherJones Thank you for the documentation, Christopher; I have to batch them up.
@abhlib Thank you Abhlib, that's helpful. I think if I batch them I will be able to complete my job; right now it's not able to process all the data in a single go.
