I currently need to parse an XML document in Python. However, I am struggling with the Python libraries and this rather complex XML.
I have looked at the method used here: python read complex xml with ElementTree, but it does not seem to work.
I am using Python 2.7.7.
The XML is taken from http://nvd.nist.gov/download.cfm#CURRENTXML, and one entry that I need to parse looks like this, for instance: http://pastebin.com/qdPN98VX
I can successfully read the ID of the first entry, but nothing inside the element is accessible. I am also not sure whether ElementTree is the best option for a 50 MB file. My relevant code looks like this at the moment:
import xml.etree.ElementTree as ET

from vulnsdb.models import Vuln as CVE

xml_path = 'CVE/20140630-NVDCE-2.0-2014.xml'
tree = ET.parse(xml_path)  # builds the whole tree in memory
root = tree.getroot()
for entry in root:
    c = CVE()
    c.name = entry.attrib['id']  # this part works
    for details in entry:
        # Look for a cve-id element below each child of the entry
        if details.find("{http://scap.nist.gov/schema/vulnerability/0.4}cve-id"):
            print(details.find("{http://scap.nist.gov/schema/vulnerability/0.4}cve-id").text)
            break
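
For comparison, here is a minimal sketch of a streaming variant that might suit a file of this size. It assumes each record is an element whose local name is `entry`, with the `cve-id` element as a direct child. The names `iter_cve_ids`, `local_name`, and `SAMPLE` are hypothetical and exist only for illustration, and the feed namespace used in `SAMPLE` is an assumption. Two things differ from the loop above: `find` is called on the entry itself, and its result is compared against `None`, because an element without children evaluates as false even when it was found.

```python
import io
import xml.etree.ElementTree as ET

# Namespace copied from the code above.
VULN_NS = "http://scap.nist.gov/schema/vulnerability/0.4"

# Tiny made-up document for illustration; the default (feed) namespace is an assumption.
SAMPLE = b"""<nvd xmlns="http://scap.nist.gov/schema/feed/vulnerability/2.0"
     xmlns:vuln="http://scap.nist.gov/schema/vulnerability/0.4">
  <entry id="CVE-2014-0001">
    <vuln:cve-id>CVE-2014-0001</vuln:cve-id>
  </entry>
</nvd>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts before tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_cve_ids(source):
    """Yield (entry id, cve-id text) pairs without building the full tree.

    `source` may be a file name or a binary file object.
    Assumption: every record is an element whose local name is 'entry'.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "entry":
            continue
        # find() searches the direct children of the entry and returns None
        # when nothing matches, so test against None rather than truthiness.
        node = elem.find("{%s}cve-id" % VULN_NS)
        text = node.text if node is not None else None
        yield elem.get("id"), text
        # Drop the finished entry's children so they do not pile up in memory.
        elem.clear()


for entry_id, cve_id in iter_cve_ids(io.BytesIO(SAMPLE)):
    print("%s %s" % (entry_id, cve_id))
```

Passing the path of the real feed file instead of `io.BytesIO(SAMPLE)` should work the same way, since `ET.iterparse` accepts either a file name or a file object.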