0

xml file example:

<header>
<name>name</name>

<items>

<item>
<title>title</title>
<add>add</add>
</item>

<item>
<title>title</title>
<add>add</add>
</item>

</items>
</header>

I would like to parse the info into groups broken up by each header and subgroup item:

xml parse too:

name
----title
----add

----title
----add

next header

name
----tile
----add
----etc
----etc

if someone could post an example, preferable with elem tree iterparse its a large xml file...

my example that doesn't work is...

import xml.etree.cElementTree as etree
infile = open("c:/1.xml", 'r')
context = etree.iterparse(infile, events=("start", "end"))

for event, element in context:
    if event == "end":
        if element.tag == "header":
            print element.findtext('name')
        elif element.tag == "item":
            print element.findtext('title')
            print element.findtext('add')
8
  • You need to start by reading the documentation and trying to solve the problem on your own, then come back and show us what you've tried and where it's failing. Commented Aug 16, 2013 at 15:22
  • You should take a look at docs.python.org/2/library/xml.etree.elementtree.html (Its part of python) Commented Aug 16, 2013 at 15:24
  • To further larsks comment, What are you trying to parse it into? do you just want it to print, do you want to store as a string, should it be a dictionary of lists... Commented Aug 16, 2013 at 15:28
  • @ larsks - I have tried to do it... figured no need to post my code since its a mess and doesn't work... Commented Aug 16, 2013 at 15:59
  • @ Chris, just a print is fine... I would have no problems using it from that point... my problem is parsing the groups... Commented Aug 16, 2013 at 15:59

1 Answer 1

3

So, nice and simply, with the infile you provided:

import xml.etree.cElementTree as etree

for event, element in etree.iterparse("C:/1.xml"):
    if element.tag == "name":
        print element.text
    elif element.tag in ["title", "add"]:
        print "---" + element.text

this gives output:

name
----title
----add
----title
----add

I guess if you wanted a spacer between headers you'd just:

if element.tag == "header":
    print "\n"
Sign up to request clarification or add additional context in comments.

1 Comment

...doesn't work, at least for python 2.7 - the entire file is loaded still.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.