2

Peers,

Newbie here. Is there a way to read data from an Excel file and load it into an Oracle table? A sample Python script would be a great help. I have written a few lines to get acquainted, as shown below.

P.S. Edit: This is only partial code. I am not sure how to include the INSERT or CREATE TABLE statement in the Oracle part. I want to load the data as it is read from Excel, looping over every column. TIA!

import openpyxl
import cx_Oracle

#Oracle connection starts here
connection = cx_Oracle.connect("<schema>", "<pwd>", "<hostname>/<sid/service>")
print("Database version:", connection.version)
print(cx_Oracle.version)
print(connection.current_schema)

# creating a table
create_table = """
CREATE TABLE test (
col1 VARCHAR2(50) NOT NULL,
col2 VARCHAR2(50) NOT NULL,
col3 VARCHAR2(50) NOT NULL,
col4 VARCHAR2(50) NOT NULL,
col5 VARCHAR2(50) NOT NULL,
col6 VARCHAR2(50) NOT NULL,
col7 VARCHAR2(50) NOT NULL
)
"""
from sys import modules
cursor.execute(create_table)    

from openpyxl import Workbook
wb = openpyxl.load_workbook('<name of the file>',data_only=True)
ws = wb['Sheet1']

x=1
m=1

# looping through each column
for j in range(2,ws.max_column+1):

   ID = m 
   col1 = ws.cell(row=x,column=j).value  
   m = m+1

   col2 = ws.cell(row=1, column=j).value

   col3 = ws.cell(row=2, column=j).value

   col4 = ws.cell(row=3,column=j).value

   col5 = ws.cell(row=4, column=j).value

   col6 = ws.cell(row=5, column=j).value

   col7 = ws.cell(row=6, column=j).value

   #looping through each row for each column      
   for i in range(1,ws.max_row+1):
         Cellval= ws.cell(row=i, column=j).value

# Inserting all the above variables for each column loop
insert_table="""
INSERT INTO test (col1,col2,col3,col4,col5,col6,col7)
VALUES ("""+col1+""",
"""+col2+""",
"""+col3+""",
"""+col4+""",
"""+col5+""",
"""+col6+""",
"""+col7+""")"""

cursor.execute(insert_table)

x=x+1

connection.close()

Am I getting it right?

3 Comments

  • Did it work? If it didn't, what went wrong? Commented Oct 20, 2017 at 17:49
  • I mean this is just my partial code. I am not sure how can I have 'insert statement' or 'create table' statement as part of this code in Oracle part. I want to load the data as it reads from excel in a loop for every column. TIA! Commented Oct 20, 2017 at 19:25
  • Create your table and add its structure to what you have above. Then you can run an INSERT statement inside the for loop with execute, or collect the rows in a list and use executemany. See the cx_Oracle docs for help with that. Commented Oct 20, 2017 at 19:37
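
To make the last comment concrete, here is a minimal sketch of collecting the values into a list first. It assumes, as in the question, that each worksheet column from the second one onward holds one record and that the leading rows hold its fields. The helper name `columns_to_records` and the sample grid are hypothetical; with openpyxl the grid could come from `list(ws.iter_rows(values_only=True))`.

```python
def columns_to_records(grid, first_col=1, n_fields=7):
    """Turn a row-major grid of cell values into one tuple per worksheet column.

    grid      -- list of rows, each a sequence of cell values
    first_col -- zero-based index of the first column that holds data
    n_fields  -- how many leading rows make up one record
    """
    field_rows = grid[:n_fields]
    # zip(*rows) transposes the grid: each output tuple is one worksheet column
    columns = zip(*field_rows)
    return [col for idx, col in enumerate(columns) if idx >= first_col]


# Hypothetical 3-row sheet: column A holds labels, columns B and C hold records
grid = [
    ("name", "alice", "bob"),
    ("city", "Oslo", "Lima"),
    ("team", "red", "blue"),
]
records = columns_to_records(grid, first_col=1, n_fields=3)
print(records)
```

The resulting list could then be handed to `cursor.executemany(insert_sql, records)` in a single call instead of executing one INSERT per column.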

2 Answers

3

You might prefer a more direct approach (no loop over each column of each row) that is also more performant (it uses cursor.executemany), built on the data analytics library pandas, as follows:

import pandas as pd
import cx_Oracle    
connection = cx_Oracle.connect("<schema>", "<pwd>", "<hostname>/<sid/service>")
cursor = connection.cursor()    
file = r'C:\path\ToFile\myFile.xlsx'  # raw string, so single backslashes are enough
tab_name = "TEST"
tab_exists = """
DECLARE
  v_exst INT;
BEGIN
  SELECT COUNT(*) 
    INTO v_exst 
    FROM cat 
   WHERE table_name = '"""+tab_name+"""' 
     AND table_type = 'TABLE';
  IF v_exst = 1 THEN
     EXECUTE IMMEDIATE('DROP TABLE """+tab_name+"""');
  END IF;   
END;
"""
cursor.execute(tab_exists)    
create_table = """
CREATE TABLE """+tab_name+""" (
       col1 VARCHAR2(50) NOT NULL,
       col2 VARCHAR2(50) NOT NULL,
       col3 VARCHAR2(50) NOT NULL,
       col4 VARCHAR2(50) NOT NULL,
       col5 VARCHAR2(50) NOT NULL,
       col6 VARCHAR2(50) NOT NULL,
       col7 VARCHAR2(50) NOT NULL
)    """    
cursor.execute(create_table)     
insert_table = "INSERT INTO "+tab_name+" VALUES (:1,:2,:3,:4,:5,:6,:7)"    
df = pd.read_excel(file)    
df_list = df.fillna('').values.tolist()    
cursor.executemany(insert_table,df_list)    
cursor.close()
connection.commit()
connection.close()

This also handles the case where the TEST table already exists by dropping it first; take care with this step, since dropping a table is destructive.

If the workbook has more than one sheet and you need to insert the contents of every sheet into the table, replace the trailing part of the code, after the table is created (cursor.execute(create_table)), with the following:

xl = pd.ExcelFile(file)
ls = list(xl.sheet_names)
insert_table = "INSERT INTO "+tab_name+" VALUES(:1,:2,:3,:4,:5,:6,:7)"
for i in ls:
    df = pd.read_excel(file,sheet_name=i)
    df_list = df.fillna('').values.tolist()
    cursor.executemany(insert_table,df_list)    
cursor.close()
connection.commit()
connection.close()
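
One caveat with `fillna('')`: Oracle treats an empty string in a VARCHAR2 column as NULL, so a row with a blank cell would likely be rejected by the NOT NULL constraints in the table definition. Below is a minimal sketch of checking rows before the insert, using only the standard library. The helper names `is_missing` and `split_complete_rows` and the sample rows are hypothetical.

```python
import math


def is_missing(value):
    """True for None, NaN floats, and empty strings."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value == ""


def split_complete_rows(rows):
    """Separate rows that are safe for NOT NULL columns from rows with gaps."""
    complete, incomplete = [], []
    for row in rows:
        if any(is_missing(v) for v in row):
            incomplete.append(row)
        else:
            complete.append(row)
    return complete, incomplete


rows = [
    ["a", "b", "c"],
    ["d", "", "f"],            # blank cell after fillna('')
    ["g", float("nan"), "i"],  # blank cell before fillna('')
]
complete, incomplete = split_complete_rows(rows)
print(len(complete), len(incomplete))
```

Only the `complete` list would then go to `cursor.executemany`, while the `incomplete` rows could be logged or filled with a placeholder value of your choosing.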

2 Comments

While this is a good extension of my much earlier answer, you should mention that it adds another layer to the solution, namely the data analytics library pandas.
Hi @Parfait, I only just saw this question and replied with an alternative approach. Thanks for the suggestion; I've added a brief mention.
0

Consider the following changes:

  • Reverse your nested for loops: run down the rows, assign the column values for each row, and then insert that row.
  • Indent your execute line so that it runs inside the row-wise loop.
  • Use .commit() after action queries such as INSERT INTO to persist the changes; Oracle commits DDL such as CREATE TABLE implicitly.
  • Parameterize your query by using the second argument in .execute(query, params), which not only guards against SQL injection (in case Excel cells contain malicious code from a clever user) but also avoids string concatenation and quote enclosure, giving cleaner code. See Oracle+Python docs.

Adjusted code

# prepare the parameterized statement and a cursor once, outside the loop
insert_table = "INSERT INTO test (col1, col2, col3, col4, col5, col6, col7)" + \
               " VALUES (:1, :2, :3, :4, :5, :6, :7)"
cursor = connection.cursor()

# looping down the rows; each row supplies one value per column
# (assumes the seven values sit in worksheet columns 2 to 8)
for i in range(1, ws.max_row+1):
   col1 = ws.cell(row=i, column=2).value
   col2 = ws.cell(row=i, column=3).value
   col3 = ws.cell(row=i, column=4).value
   col4 = ws.cell(row=i, column=5).value
   col5 = ws.cell(row=i, column=6).value
   col6 = ws.cell(row=i, column=7).value
   col7 = ws.cell(row=i, column=8).value

   # executed once per row, inside the row loop
   cursor.execute(insert_table, (col1, col2, col3, col4, col5, col6, col7))

# commit once after all rows are inserted
connection.commit()
connection.close()
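
As a companion to the parameterization point, here is a small sketch of generating the numbered placeholders rather than typing them out. `build_insert` is a hypothetical helper, not part of cx_Oracle. Table and column names cannot be bound as parameters, so the sketch checks them against a conservative pattern before placing them in the SQL text.

```python
import re

# Conservative pattern for unquoted identifiers (an assumption of this sketch)
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def build_insert(table, columns):
    """Return an INSERT statement with :1, :2, ... positional bind placeholders."""
    for name in [table, *columns]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    placeholders = ", ".join(f":{n}" for n in range(1, len(columns) + 1))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


sql = build_insert("test", ["col1", "col2", "col3"])
print(sql)
```

The cell values themselves never enter the SQL string; they would travel separately as the second argument to `cursor.execute(sql, row)`.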

3 Comments

Do NOT use repeated calls to execute() because it is very inefficient, particularly for large data sets. Instead use executemany() as shown in the other answer. Look at the benchmark graph in blogs.oracle.com/opal/…
@ChristopherJones...Agreed. Given that this answer is nearly three years old, I have certainly learned many finer points of Python's DB-API since. If I had to update all my past answers with better tweaks, I would not have time for other things. I hope future readers are not deterred by the downvote, as some elements of this answer hold merit, such as advising the OP on parameterization.
Agreed. For future readers: the cx_Oracle doc on 'Batch Statement Execution and Bulk Loading' is at cx-oracle.readthedocs.io/en/latest/user_guide/…
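
To illustrate the batching these comments recommend, here is a rough sketch that feeds rows to `executemany()` in fixed-size chunks. `chunked`, `load_in_batches`, the batch size, and the `RecordingCursor` stand-in are all assumptions made for this example; with cx_Oracle you would pass a real cursor instead.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def load_in_batches(cursor, sql, rows, batch_size=1000):
    """Call cursor.executemany() once per batch; return the number of rows sent."""
    total = 0
    for batch in chunked(rows, batch_size):
        cursor.executemany(sql, batch)
        total += len(batch)
    return total


class RecordingCursor:
    """Stand-in for a DB-API cursor that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def executemany(self, sql, seq_of_params):
        self.calls.append((sql, list(seq_of_params)))


cursor = RecordingCursor()
rows = [(i, f"name{i}") for i in range(5)]
sent = load_in_batches(cursor, "INSERT INTO test VALUES (:1, :2)", rows, batch_size=2)
print(sent, [len(params) for _, params in cursor.calls])
```

Chunking keeps memory bounded for very large sheets while still avoiding one round trip per row; a suitable batch size depends on row width and the network, so treat the default here as a placeholder.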
