0

I have an XML file like this i need to insert this data to PostgreSQL DB.Below is the sample XML and the code which i use ,but i'm not getting any output,Can someone please guide on how to effectively fetch these XML values.

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0" encoding="utf-8">
  <config>
    <g:system>Magento</g:system>
    <g:extension>Magmodules_Googleshopping</g:extension>
    <g:extension_version>1.6.8</g:extension_version>
    <g:store>emb</g:store>
    <g:url>https://www.xxxxx.com/</g:url>
    <g:products>1320</g:products>
    <g:generated>2020-06-11 11:18:32</g:generated>
    <g:processing_time>17.5007</g:processing_time>
  </config>
  <channel>
    <item>
      <g:id>20</g:id>
      <g:title>product 1</g:title>
      <g:description>description about product 1</g:description>
      <g:gtin>42662</g:gtin>
      <g:brand>company</g:brand>
      <g:mpn>0014</g:mpn>
      <g:link>link.html</g:link>
      <g:image_link>link/c/a/cat_21_16.jpg</g:image_link>
      <g:availability>in stock</g:availability>
      <g:condition>new</g:condition>
      <g:price>9</g:price>
      <g:shipping>
        <g:country>UAE</g:country>
        <g:service>DHL</g:service>
        <g:price>2.90</g:price>
      </g:shipping>
    </item>
    <item>
     .
     .
     .
   </item>

Below is the script which i use, Python : 3.5 Postgres version 11

    # import modules
    import sys
    import psycopg2
    import datetime
    now = datetime.datetime.now()
    # current data and time
    dt = now.strftime("%Y%m%dT%H%M%S")

    # xml tree access
    #from xml.etree import ElementTree
    import xml.etree.ElementTree as ET

    # incremental variable
    x = 0
    with open('/Users/admin/documents/shopping.xml', 'rt',encoding="utf8") as f:
            #tree = ElementTree.parse(f)
            tree = ET.parse(f)
    # connection to postgreSQL database
    try:
        conn=psycopg2.connect(host='localhost', database='postgres',
                                       user='postgres', password='postgres',port='5432')
    except:
        print ("Hey I am unable to connect to the database.")
    cur = conn.cursor()
    # access the xml tree element nodes
    try:
        for node in tree.findall('.//item'):
            src = node.find('id')
            tgt = node.find('mpn')
            print(node)

    except:
        print ("Oops I can't insert record into database table!")
        conn.commit()
        conn.close()

The current output i'm getting is like,

None
None
None

Expected Output,

id title       description    gtin ......
20 product 1   g:description  xxxx .....
2
  • I used ET.fromstring("...") and I have no problem to get item - so maybe check what you read from file - maybe you read file which has no item. BTW" problem is id,tgt because it uses namespace g: Commented Jun 11, 2020 at 12:54
  • @furas tried with g: sill same, Can you please let me know where i have to change? Commented Jun 11, 2020 at 13:00

1 Answer 1

1

Strange is that you can't find item. It seems you use wrong file and it doesn't have item.

Using your XML data as string and ET.fromstring() I have no problem to get item.

Maybe check print( f.read() ) to see what you really read from file.


Problem is only id, tgt which use namespace - g: - and it need something more then only g:id, g:tgt

tree = ET.fromstring(xml)

ns = {'g': "http://base.google.com/ns/1.0"}

for node in tree.findall('.//item'):

    src = node.find('g:id', ns)
    tgt = node.find('g:mpn', ns)

    print('Node:', node)
    print('src:', src.text)
    print('tgt:', tgt.text)

or use directly as '{http://base.google.com/ns/1.0}id' '{http://base.google.com/ns/1.0}mpn'

tree = ET.fromstring(xml)

for node in tree.findall('.//item'):

    src = node.find('{http://base.google.com/ns/1.0}id')
    tgt = node.find('{http://base.google.com/ns/1.0}mpn')

    print('Node:', node)
    print('src:', src.text)
    print('tgt:', tgt.text)

Minimal working code:

import xml.etree.ElementTree as ET

xml = '''<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0" encoding="utf-8">
  <config>
    <g:system>Magento</g:system>
    <g:extension>Magmodules_Googleshopping</g:extension>
    <g:extension_version>1.6.8</g:extension_version>
    <g:store>emb</g:store>
    <g:url>https://www.xxxxx.com/</g:url>
    <g:products>1320</g:products>
    <g:generated>2020-06-11 11:18:32</g:generated>
    <g:processing_time>17.5007</g:processing_time>
  </config>
  <channel>
    <item>
      <g:id>20</g:id>
      <g:title>product 1</g:title>
      <g:description>description about product 1</g:description>
      <g:gtin>42662</g:gtin>
      <g:brand>company</g:brand>
      <g:mpn>0014</g:mpn>
      <g:link>link.html</g:link>
      <g:image_link>link/c/a/cat_21_16.jpg</g:image_link>
      <g:availability>in stock</g:availability>
      <g:condition>new</g:condition>
      <g:price>9</g:price>
      <g:shipping>
        <g:country>UAE</g:country>
        <g:service>DHL</g:service>
        <g:price>2.90</g:price>
      </g:shipping>
    </item>
  </channel>
</rss>    
'''

tree = ET.fromstring(xml)

ns = {'g': "http://base.google.com/ns/1.0"}

for node in tree.findall('.//item'):

    src = node.find('g:id', ns)
    tgt = node.find('g:mpn', ns)

    print('Node:', node)
    print('src:', src.text)
    print('tgt:', tgt.text)

Result:

Node: <Element 'item' at 0x7f74ba45b710>
src: 20
tgt: 0014

BTW: It works even when I use io.StringIO to simulate file

f = io.StringIO(xml)
tree = ET.parse(f)

Minimal working code:

import xml.etree.ElementTree as ET
import io

xml = '''<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0" encoding="utf-8">
  <config>
    <g:system>Magento</g:system>
    <g:extension>Magmodules_Googleshopping</g:extension>
    <g:extension_version>1.6.8</g:extension_version>
    <g:store>emb</g:store>
    <g:url>https://www.xxxxx.com/</g:url>
    <g:products>1320</g:products>
    <g:generated>2020-06-11 11:18:32</g:generated>
    <g:processing_time>17.5007</g:processing_time>
  </config>
  <channel>
    <item>
      <g:id>20</g:id>
      <g:title>product 1</g:title>
      <g:description>description about product 1</g:description>
      <g:gtin>42662</g:gtin>
      <g:brand>company</g:brand>
      <g:mpn>0014</g:mpn>
      <g:link>link.html</g:link>
      <g:image_link>link/c/a/cat_21_16.jpg</g:image_link>
      <g:availability>in stock</g:availability>
      <g:condition>new</g:condition>
      <g:price>9</g:price>
      <g:shipping>
        <g:country>UAE</g:country>
        <g:service>DHL</g:service>
        <g:price>2.90</g:price>
      </g:shipping>
    </item>
  </channel>
</rss>    
'''

f = io.StringIO(xml)
tree = ET.parse(f)

ns = {'g': "http://base.google.com/ns/1.0"}

for node in tree.findall('.//item'):
    src = node.find('{http://base.google.com/ns/1.0}id')
    tgt = node.find('{http://base.google.com/ns/1.0}mpn')
    print('Node:', node)
    print('src:', src.text)
    print('mpn:', tgt.text)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.