Python XML parse ElementTree complex XML structure

Question

I currently need to parse an XML document in Python. However, I am struggling with the Python libraries and this rather complex XML.

I have looked at the method used here: python read complex xml with ElementTree, but it does not seem to work.

I am using Python 2.7.7

The XML is taken from http://nvd.nist.gov/download.cfm#CURRENTXML and, for instance, one entry that I need to parse looks like this: http://pastebin.com/qdPN98VX

My relevant code looks like this at the moment. I can successfully read the ID of the first entry; however, nothing inside the element is accessible. I am also not sure whether ElementTree is the best option for a 50 MB file:

import xml.etree.ElementTree as ET

from vulnsdb.models import Vuln as CVE


file = 'CVE/20140630-NVDCE-2.0-2014.xml'

tree = ET.parse(file)
root = tree.getroot()

for entry in root:
    c = CVE()
    c.name = entry.attrib['id']
    for details in entry:
        if details.find("{http://scap.nist.gov/schema/vulnerability/0.4}cve-id"):
            print details.find("{http://scap.nist.gov/schema/vulnerability/0.4}cve-id").text
    break

alecxe · Accepted Answer · 2014-06-30 14:54:23Z

2

You can use xml.etree.ElementTree.iterparse(), which parses the tree incrementally:

import xml.etree.ElementTree as ET


TAG = '{http://scap.nist.gov/schema/feed/vulnerability/2.0}entry'
ID = "CVE-2014-0001"

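# iterparse accepts a filename directly and reports only 'end' events by default
tree = ET.iterparse('CVE/20140630-NVDCE-2.0-2014.xml')
for event, element in tree:
    # an 'end' event fires once the element and all its children are parsed
    if event == 'end' and element.tag == TAG and element.attrib.get('id') == ID:
        print ET.tostring(element)
        break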
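
If you need every entry rather than a single ID, the same approach can be extended. What follows is only a sketch under some assumptions: the helper name iter_cve_ids and the tiny in-memory sample are made up for illustration, and it assumes cve-id is a direct child of each entry (the namespaces are the ones used in the question and in the code above). It also compares the result of find() with None, because an Element without children has historically tested as false, which would make the `if details.find(...)` check in the question skip the match. It is written to run on Python 3 as well as 2.7.

```python
import io
import xml.etree.ElementTree as ET

# Namespaces as they appear in the question and in the answer code.
FEED_NS = '{http://scap.nist.gov/schema/feed/vulnerability/2.0}'
VULN_NS = '{http://scap.nist.gov/schema/vulnerability/0.4}'


def iter_cve_ids(source):
    """Yield (entry id, cve-id text) pairs, one entry at a time.

    `source` may be a filename or a file object opened in binary mode.
    (Hypothetical helper, not part of ElementTree.)
    """
    for event, element in ET.iterparse(source, events=('end',)):
        if element.tag != FEED_NS + 'entry':
            continue
        # Assumption: cve-id sits directly under entry.
        cve = element.find(VULN_NS + 'cve-id')
        # Test against None explicitly; do not rely on Element truthiness.
        yield element.attrib.get('id'), (cve.text if cve is not None else None)
        # Drop the entry's children once the caller is done with it,
        # so already-processed entries do not keep their subtrees in memory.
        element.clear()


# Made-up two-entry sample, only to show the call.
SAMPLE = b"""<nvd xmlns="http://scap.nist.gov/schema/feed/vulnerability/2.0"
     xmlns:vuln="http://scap.nist.gov/schema/vulnerability/0.4">
  <entry id="CVE-2014-0001"><vuln:cve-id>CVE-2014-0001</vuln:cve-id></entry>
  <entry id="CVE-2014-0002"/>
</nvd>"""

pairs = list(iter_cve_ids(io.BytesIO(SAMPLE)))
```

For the real feed you would pass the path instead, e.g. iter_cve_ids('CVE/20140630-NVDCE-2.0-2014.xml'), and build your model objects inside the loop. Clearing each entry after use should keep memory use much lower than ET.parse() on a 50 MB file, although the emptied entry elements themselves remain attached to the root.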

edited Jun 30, 2014 at 14:54

answered Jun 30, 2014 at 14:41

alecxe
