Python 3.2.5 x64 ElementTree

I have data that I need to format using Python. Essentially, I have a file with elements and subelements, and I need to delete the child elements of some of these elements. I have checked previous questions but couldn't put together a solution. The best attempt I had so far removes only every second child element.

Sample data:

<Leg1:MOR oCount="7" xmlns:Leg1="http://what.not">
    <Leg1:Order>
        <Leg1:CTemp id="FO">
            <Leg1:Group bNum="001" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
            <Leg1:Group bNum="002" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
        </Leg1:CTemp>
        <Leg1:CTemp id="GO">
            <Leg1:Group bNum="001" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
            <Leg1:Group bNum="002" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
        </Leg1:CTemp>
    </Leg1:Order>
</Leg1:MOR>

What I need the output to look like:

<Leg1:MOR oCount="7" xmlns:Leg1="http://what.not">
    <Leg1:Order>
        <Leg1:CTemp id="FO">
            <Leg1:Group bNum="001" cCount="10"/>
            <Leg1:Group bNum="002" cCount="10"/>
        </Leg1:CTemp>
        <Leg1:CTemp id="GO">
            <Leg1:Group bNum="001" cCount="10"/>
            <Leg1:Group bNum="002" cCount="10"/>
        </Leg1:CTemp>
    </Leg1:Order>
</Leg1:MOR>

I haven't written any code in a while and my code is useless. I can parse the file and write it, but I cannot get the processing right.

import xml.etree.cElementTree as ET
tree = ET.parse("input.xml")
root = tree.getroot()
for x in root.findall('./Order/CTemp/Group'):
    root.remove(x)
tree.write("output.xml")

How do I get it to remove the Dog elements under the CTemp elements?

  • Try to use namespaces. Commented May 13, 2015 at 9:32

1 Answer

If you can use lxml, try this:

import lxml.etree

tree = lxml.etree.parse("leg.xml")
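# Select every Dog element; the XPath prefix is mapped to the namespace URI
for dog in tree.xpath("//Leg1:Dog",
                      namespaces={"Leg1": "http://what.not"}):
    parent = dog.getparent()
    parent.remove(dog)
    # Drop leftover whitespace so the empty Group is written as <Leg1:Group .../>
    parent.text = None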
tree.write("leg.out.xml")

Now leg.out.xml looks like this:

<?xml version="1.0"?>
<Leg1:MOR xmlns:Leg1="http://what.not" oCount="7">
  <Leg1:Order>
    <Leg1:CTemp id="FO">
      <Leg1:Group bNum="001" cCount="4"/>
      <Leg1:Group bNum="002" cCount="4"/>
    </Leg1:CTemp>
    <Leg1:CTemp id="GO">
      <Leg1:Group bNum="001" cCount="4"/>
      <Leg1:Group bNum="002" cCount="4"/>
    </Leg1:CTemp>
  </Leg1:Order>
</Leg1:MOR>
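
If lxml is not an option, roughly the same thing can be sketched with the standard-library ElementTree that the question started from. The question's loop most likely fails for two reasons: the path './Order/CTemp/Group' has no namespace, so it matches nothing, and root.remove(x) only works for direct children of root. Removing children while iterating over their parent is also the usual cause of the "every second child" symptom. The helper name strip_dogs and the NS constant below are made up for illustration; tags are written in ElementTree's {uri}name form so no namespaces argument is needed.

```python
import xml.etree.ElementTree as ET

NS = "http://what.not"
# Keep the Leg1: prefix on output instead of an auto-generated ns0:
ET.register_namespace("Leg1", NS)


def strip_dogs(root):
    """Remove every Dog child from every Group element below root."""
    group_tag = "{%s}Group" % NS
    dog_tag = "{%s}Dog" % NS
    # Materialize the Group list first so the tree is not changed mid-iteration
    for group in list(root.iter(group_tag)):
        # findall returns a new list, so removing from group here is safe
        for dog in group.findall(dog_tag):
            group.remove(dog)
        if len(group) == 0:
            # Drop leftover whitespace so the Group is written as an empty tag
            group.text = None
    return root
```

With the question's files this would sit between the existing calls: tree = ET.parse("input.xml"), then strip_dogs(tree.getroot()), then tree.write("output.xml"). Like the lxml version, it leaves the cCount attribute untouched.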

8 Comments

Great, thank you! One step closer. Now can you think of any way to collapse the Group element from <Leg1:Group bNum="001" cCount="4"></Leg1:Group> to <Leg1:Group bNum="001" cCount="4"/>?
@LCGA I've improved my answer.
Awesome! Thank you so much. I hate to admit it but I was stuck on this for a full day yesterday.
A small side note: your parsing of the XML file produces this error: lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1. This is a problem with larger files. I changed tree = lxml.etree.parse(open("leg.xml")) to tree = lxml.etree.parse("leg.xml").
So if I want to remove the Leg1: prefix from all the elements how would I go about doing that?
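
The thread does not answer this last question. One possible approach, sketched here with the standard-library ElementTree rather than lxml, is to rewrite each tag from its stored {uri}name form to the bare local name. The function name strip_namespace_prefixes is hypothetical, and the sketch assumes that only element tags are namespaced (as in the sample data), not attributes.

```python
import xml.etree.ElementTree as ET


def strip_namespace_prefixes(root):
    """Rewrite every tag from '{uri}name' to plain 'name', in place."""
    for elem in root.iter():
        # Only string tags in '{uri}name' form carry a namespace
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root
```

Once no tag carries a namespace, ElementTree should no longer write the xmlns:Leg1 declaration either, so Leg1:Group comes out as Group.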