I am looking for a way to remove every <e> element whose text is mmm (i.e. <e>mmm</e>) from an XML file. I am using this thread as a starting guide: "How to remove elements from XML using Python", but I want to do it without the lxml library, using ElementTree with Python v2.6.6 instead. I have tried to connect the dots between that thread and the ElementTree API docs, but I haven't been successful.

I would appreciate any advice or thoughts on this.

<?xml version='1.0' encoding='UTF-8'?>
<parent>
   <first>
     <a>123</a>                              
     <c>987</c>
       <d>
         <e>mmm</e>
         <e>yyy</e>           
       </d>         
   </first>
   <second>
     <a>456</a>                      
     <c>345</c>
       <d>
         <e>mmm</e>
         <e>hhh</e>            
       </d>
   </second>
 </parent>

2 Answers

It took a while for me to realise all <e> tags are subnodes of <d>.

If we can assume the above is true for all your target nodes (<e> nodes with value mmm), you can use this script. (I added some extra nodes to check that it worked.)

import xml.etree.ElementTree as ET

xml_string = """<?xml version='1.0' encoding='UTF-8'?>
<parent>
   <first>
     <a>123</a>                              
     <c>987</c>
       <d>
         <e>mmm</e>
         <e>aaa</e>
         <e>mmm</e>
         <e>yyy</e>           
       </d>         
   </first>
   <second>
     <a>456</a>                      
     <c>345</c>
       <d>
         <e>mmm</e>
         <e>hhh</e>            
       </d>
   </second>
 </parent>"""

# this is how I create my root, if you choose to do it in a different way the end of this script might not be useful
root = ET.fromstring(xml_string)

target_node_first_parent = 'd'
target_node = 'e'
target_text = 'mmm'

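# find all <d> nodes (on Python 2.6 use root.getiterator(...); Element.iter() was added in 2.7)
for node in root.iter(target_node_first_parent):
    # findall() returns a list of the direct <e> children of <d>, so it is
    # safe to remove from <d> while looping (iter() could skip siblings)
    for subnode in node.findall(target_node):
        if subnode.text == target_text:
            node.remove(subnode)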

# output the result         
tree = ET.ElementTree(root)
tree.write('output.xml')

I first tried to remove the nodes found by root.iter(yourtag) directly from the root, but that doesn't work: Element.remove() only removes direct children, so each <e> has to be removed from its own parent <d>.
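If the target elements are not all direct children of one known parent tag, a common workaround is to build a child-to-parent map first, because ElementTree elements do not keep a reference to their parent. The following is a minimal sketch, not part of the original script; the helper name remove_elements_with_text and the sample XML are made up for illustration. It uses Element.iter(), so it assumes Python 2.7 or later.

```python
import xml.etree.ElementTree as ET


def remove_elements_with_text(root, tag, text):
    """Remove every <tag> element whose text equals `text`, at any depth.

    Hypothetical helper for illustration; returns the number of elements removed.
    """
    # Map each element to its parent, since Element has no parent pointer.
    parent_of = {}
    for parent in root.iter():
        for child in parent:
            parent_of[child] = parent

    # Collect the targets first so nothing is removed while iterating the tree.
    targets = [el for el in root.iter(tag) if el.text == text]

    removed_count = 0
    for el in targets:
        parent = parent_of.get(el)
        if parent is not None:  # the root has no parent and is left alone
            parent.remove(el)
            removed_count += 1
    return removed_count


# Made-up sample: the <e> elements sit under different parent tags.
sample = (
    "<parent>"
    "<first><d><e>mmm</e><e>mmm</e><e>yyy</e></d></first>"
    "<x><e>mmm</e></x>"
    "</parent>"
)
root = ET.fromstring(sample)
removed = remove_elements_with_text(root, 'e', 'mmm')
remaining = [el.text for el in root.iter('e')]
print(removed, remaining)
```

Collecting the targets into a list before removing anything avoids changing the tree while it is being iterated, which is what can cause consecutive matches to be skipped.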

The answer by @Queuebee is correct, but in case you want to read the XML from a file, the code below shows one way to do that.

import xml.etree.ElementTree as ET

file_loc = " "  # set this to the path of your XML file
xml_tree_obj = ET.parse(file_loc)

xml_roots = xml_tree_obj.getroot()

target_node_first_parent = 'd'
target_node = 'e'
target_text = 'mmm'

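# find all <d> nodes
for node in xml_roots.iter(target_node_first_parent):
    # findall() returns a list of the direct <e> children of <d>, so it is
    # safe to remove from <d> while looping (iter() could skip siblings)
    for subnode in node.findall(target_node):
        if subnode.text == target_text:
            node.remove(subnode)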

out_tree = ET.ElementTree(xml_roots)
out_tree.write('output.xml')