1

I'm trying to write a Python script that will go through Rows and put them into my database

This is a structure of my xml:

Root>
    -<SvcNf>
        -<PersonNf>
            -<PersonList>
                -<Row>
                    <SysName>MI6</SysName>
                     <ServerDt>2016-10-28 03:00:12 +03:00</ServerDt>
                     <UID>9457A55E17341AA7ASDEDS057A8BFFF3</UID>
                     <PersID>007</PersID>
                     <Emp_name>James Bond</Emp_name>
                     <EventID>25</EventID>
                     <EventTXT>Drinking alcohol</EventTXT>
                     <CauseEventID>03</CauseEventID>
                     <CauseEventTXT>Martini with vodka</CauseEventTXT>
                     <EventBegda>2017-10-18</EventBegda>
                     <EventEndda>2017-10-18</EventEndda>
                     <AccrualsSum>171.0</AccrualsSum>
                     <AccrualsProz>0.0</AccrualsProz>
                     <AccrualsName>Chinees_</AccrualsName>
                     <OrderNum>P-336</OrderNum>
                     <Perg>0</Persg>
                     <Perk>15</Persk>
                     <Awart/>
                 </Row>
                 -<Row>
                     .....
                 </Row>
                <Row/>
            </PersonList>
        </PersonNf>
    </SvcNf>
</Root>

So, when i use this code to Parse XML:

ResultSet_Py_List = []

root = ET.parse(events)
nodes = root.findall('.//Row')

for node in nodes:
    for child in node:
        ResultSet_Py_List.append(child.text)

The nominal Row value, which I want, is

['MI6', '2016-10-28 03:00:12 +03:00', '9457A55E17341AA7ASDEDS057A8BFFF3', etc]

[Row2]

[Row3]

Now, its like: [Row1,Row2,Row3]

[EDIT] All characters is like &#1060.
To heal it:

tostring(doc.getroot(), 'unicode')

[EDIT]

I used 2nd example but now i have Oracle error: ORA-01704: string literal too long.

5
  • @Максим Федоров, so after every Row ends its 'None' that separates rows right ? Commented Oct 30, 2017 at 17:43
  • @Parfait - Oracle 11G Database Commented Oct 30, 2017 at 17:48
  • @Chetan Vasudevan - not as usual, <Awart/> can be not null Commented Oct 30, 2017 at 17:49
  • @МаксимФедоров, then which can be a factor/key of seperation between the rows ? Commented Oct 30, 2017 at 17:55
  • <SysName>MI6</SysName> - This a seperation between Rows, always = constant Commented Oct 30, 2017 at 18:04

3 Answers 3

1

Maybe someone it will be helpfull. Its begginer answer :) I work with unicode(russian symbols) and I didnt i know why xml didnt normal parse with Oracle function. When I try do ALTERNATIVELY WITH NESTED LIST COMPREHENSION my kernel and python are died.

So, I do CSV from XML and with ALTERNATIVELY WITH NESTED LIST COMPREHENSION insert it into my database

import pandas as pd
import xml.etree.ElementTree as ET
import cx_Oracle
import csv


doc = ET.parse(events)
nodes = doc.findall('.//Row')
#RANGE(num) - COLUMN COUNTS
(SysName, ServerDt, UIDS, EmplID, ....T) = (list() for i in range(18))

for node in nodes:
    for elem in node.findall("*"):
        try:
            if elem.tag == "SysName":
                SysName.append(elem.text)            
            if elem.tag == "UID":
                UIDS.append(elem.text)     
            if elem.tag == "ServerDt":
                ServerDt.append(elem.text)
            if elem.tag == "PersID":
                PersID.append(elem.text)
            ...
            except AttributeError:
                print(elem.tag)

    #ER nonetype' object has no attribute 'text' python & 
    #ER arrays must all be same length
    s1 = pd.Series(SysName)
    s2 = pd.Series(ServerDt)
    s3 = pd.Series(UIDS)
    .....
    #TEMPORY CSV. DONT KNOW WHY, BUT COLUMNS ARE MIXED WITH THIS ITERATION
    df= pd.DataFrame({"SysName": s1,
                    "ServerDt": s2,
                    "UIDS": s3,
                    "PersID": s4,
                    "Emp_name": s5,
                    "EVENTID": s6,
                    ...})
    file_name = 'events.csv'
    df.to_csv(file_name, sep='\t')

    print("File name: ", my_file.name, "created")

    ResultSet_Py_List = []   

    ora_conn = cx_Oracle.connect('login/pass@TNSNAME')
    ora_cursor = ora_conn.cursor()

    my_file = open(file_name, 'r', newline='')
    #PANDA ER: 'utf8' codec can't decode byte 0xe9 in position 10: invalid continuation byte        
    reader = csv.reader(my_file,  dialect='excel', delimiter='\t' )

    for index, row in enumerate(reader):
    #Without header CSV
            if index > 0:
                    try:
            ResultSet_Py_List.append(row)
                    except AttributeError:
            pass

    print(str(len(ResultSet_Py_List)) + ' Records from Source')

    sql_del = """delete from HR.EVENTS_TST"""

    ora_cursor.execute(sql_del)
    ora_cursor.execute("commit")

    print("Table is clean") 
    #COLUMNS ARE MIXED IN CSV ITERATION
    sql_insert = """
            INSERT INTO HR.EVENTS_TST (ROW_NUM
                    ,ACCRUALSNAME
                    , ACCRUALSPROZ
                    , ACCRUALSSUM
                    , AWART
                    , CAUSEEVENTID
                    ...)
                        VALUES         (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14
                        ,:15,:16,:17,:18, :19)
                        """  

    ora_cursor.prepare(sql_insert)

    ora_cursor.executemany(None, ResultSet_Py_List)
    ora_conn.commit()
    ora_cursor.execute("commit")
    print("Data imported")
Sign up to request clarification or add additional context in comments.

Comments

0

Consider using Oracle's XML handlers and avoid any nested looping:

SELECT  e.SysName, e.ServerDt, e."UID", e.PersID, e.Emp_name, e.EventID, e.EventTXT,
        e.CauseEventID, e.CauseEventTXT, e.EventBegda, e.EventEndda, 
        e.AccrualsSum, e.AccrualsProz, e.OrderNum, e.Perg, e.Perk, e.Awart

FROM  XMLTABLE('/Root/SvcNf/PersonNf/PersonList/Row' 
      PASSING XMLTYPE('<Root>
                         <SvcNf>
                            <PersonNf>
                                <PersonList>
                                    <Row>
                                        <SysName>MI6</SysName>
                                         <ServerDt>2016-10-28 03:00:12 +03:00</ServerDt>
                                         <UID>9457A55E17341AA7ASDEDS057A8BFFF3</UID>
                                         <PersID>007</PersID>
                                         <Emp_name>James Bond</Emp_name>
                                         <EventID>25</EventID>
                                         <EventTXT>Drinking alcohol</EventTXT>
                                         <CauseEventID>03</CauseEventID>
                                         <CauseEventTXT>Martini with vodka</CauseEventTXT>
                                         <EventBegda>2017-10-18</EventBegda>
                                         <EventEndda>2017-10-18</EventEndda>
                                         <AccrualsSum>171.0</AccrualsSum>
                                         <AccrualsProz>0.0</AccrualsProz>
                                         <AccrualsName>Chinees_</AccrualsName>
                                         <OrderNum>P-336</OrderNum>
                                         <Perg>0</Perg>
                                         <Perk>15</Perk>
                                         <Awart/>
                                     </Row>
                                </PersonList>
                            </PersonNf>
                         </SvcNf>
                      </Root>')
          COLUMNS  
              SysName   VARCHAR2(25) PATH 'SysName',  
              ServerDt  VARCHAR2(25) PATH 'ServerDt',  
              "UID"     VARCHAR2(25)       PATH 'UID',  
              PersID    VARCHAR2(25)  PATH 'PersID',
              Emp_name  VARCHAR2(25)  PATH 'Emp_name',
              EventID   NUMBER       PATH 'EventID',
              EventTXT  VARCHAR2(25)  PATH 'EventTXT',
              CauseEventID    VARCHAR2(25) PATH 'CauseEventID',
              CauseEventTXT   VARCHAR2(25) PATH 'CauseEventTXT',               
              EventBegda      VARCHAR2(25) PATH 'EventBegda',
              EventEndda      VARCHAR2(25) PATH 'EventEndda',
              AccrualsSum     VARCHAR2(25) PATH 'AccrualsSum',
              AccrualsProz    VARCHAR2(25) PATH 'AccrualsProz',
              OrderNum  VARCHAR2(25) PATH 'OrderNum',
              Perg      NUMBER PATH 'Perg',
              Perk      NUMBER PATH 'Perk',
              Awart     VARCHAR2(25) PATH 'Awart') AS e;

Rextester demo

To implement in a Python database cursor such as with cx_Oracle for an append query:

sql = """INSERT INTO mytable (Col1, Col2, Col3, ...)
         SELECT ...same as above...        
         FROM  XMLTABLE('/Root/SvcNf/PersonNf/PersonList/Row' 
                   PASSING XMLTYPE(:i_param)
                   COLUMNS  
                      ...same as above...) AS e
"""

# PARSE XML FILE
doc = ET.parse(events)
xmlstr = ET.tostring(doc.getroot()).decode('utf-8')

# PASS XML STRING AS PARAMETER
cur.execute(sql, {'i_param':xmlstr})
dbconn.commit()

For very large XML content that exceeds Oracle's 4,000 bytes in SQL, use PL/SQL with varchar2 limit of 32,767 bytes as shown here by @NickS:

sql = """
DECLARE
    xml_value varchar2(32767);
BEGIN
    xml_value := :i_param;

    INSERT INTO mytable (Col1, Col2, Col3, ...)
             SELECT ...same as above...        
             FROM  XMLTABLE('/Root/SvcNf/PersonNf/PersonList/Row' 
                       PASSING XMLTYPE(xml_value)
                       COLUMNS  
                          ...same as above...) AS e
    commit;
END;
"""

# PARSE XML FILE
doc = ET.parse(events)
xmlstr = ET.tostring(doc.getroot()).decode('utf-8')

# PASS XML STRING AS PARAMETER
cur.execute(sql, {'i_param':xmlstr})
dbconn.commit()

11 Comments

I used 2nd example but now i have Oracle error: ORA-01704: string literal too long.
thanks for answer, child is large or the /row part? doesnt understand i need to correct with declare my COLUMNS part or this PASSING XMLTYPE('{}') or just do like Emp_name VARCHAR2(32767) PATH 'Emp_name '
Entire XML content is large. How large is your file in MB?
its small, now its only 50 rows about 36.3 KB
Did you try my updated answer here? Both now use parameterization. Also, can you run the first SELECT in Oracle (outside Python) with actual XML? Does it error out there as well?
|
0

To use original setup, simply add an inner list and then append to larger, ResultSet_Py_List, where each <Row> is saved in a nested list for database insert.

import xml.etree.ElementTree as et

doc = et.parse('Source.xml')
nodes = doc.findall('.//Row')

ResultSet_Py_List = []

for node in nodes:
    inner = []
    for child in node:
        inner.append(child.text)
    ResultSet_Py_List.append(inner)

# ALTERNATIVELY WITH NESTED LIST COMPREHENSION
ResultSet_Py_List = [[child.text for child in node] for node in nodes]

Then run cx_Oracle's executemany where parameter placeholders are equal length to list items:

cursor.prepare("INSERT INTO myTable (Col1, Col2, Col3, ...) VALUES (:1, :2, :3, ...)")
cursor.executemany(None, ResultSet_Py_List)
db.commit()

1 Comment

When i do ora_cursor.prepare("INSERT INTO HR.EVENTS_TST (SYSNAME, PORTDATE, UID, TabNum, FIO ....) VALUES (:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16,:17,:18)") ora_cursor.executemany(None, ResultSet_Py_List) - the kernel and python are died. Have any ideas why?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.