4

I'm trying to parse XML from a string in Python, without success. The string I'm trying to parse is:

<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:573a453c-72c0-4185-8c54-9010593dd102">
   <data>
      <config xmlns="http://www.calix.com/ns/exa/base">
         <profile>
            <policy-map>
               <name>ELINE_PM_1</name>
               <class-map-ethernet>
                  <name>Eth-match-any-1</name>
                  <ingress>
                     <meter-type>meter-mef</meter-type>
                     <eir>1000000</eir>
                  </ingress>
               </class-map-ethernet>
            </policy-map>
            <policy-map>
               <name>ELINE_PM_2</name>
               <class-map-ethernet>
                  <name>Eth-match-any-2</name>
                  <ingress>
                     <meter-type>meter-mef</meter-type>
                     <eir>10000000</eir>
                  </ingress>
               </class-map-ethernet>
            </policy-map>
         </profile>
      </config>
   </data>
</rpc-reply>

I'm using the xml.etree.ElementTree library to parse the XML. I also tried removing the first line (the XML version and encoding declaration), with no change in the result. The code snippet that reproduces the issue is:

import xml.etree.ElementTree as ET

reply_xml='''
<data>
   <config>
      <profile>
         <policy-map>
            <name>ELINE_PM_1</name>
            <class-map-ethernet>
               <name>Eth-match-any-1</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>1000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
         <policy-map>
            <name>ELINE_PM_2</name>
            <class-map-ethernet>
               <name>Eth-match-any-2</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>10000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
      </profile>
   </config>
</data>
'''

root = ET.fromstring(reply_xml)
for child in root:
    print(child.tag, child.attrib)

reply_xml is a string containing the XML shown above, so it should work, but when I inspect the root variable in the debugger it does not appear to be populated correctly. The first line (<?xml version="1.0" encoding="UTF-8"?>) seems to cause problems, yet even after removing it manually I cannot parse the XML correctly.

Any idea how to parse this XML?

  • What information do you want to collect from this XML? Commented Oct 5, 2021 at 15:00
  • The <?xml .. ?> part is not a tag, but the XML declaration. And ElementTree can handle that perfectly. Commented Oct 5, 2021 at 15:00
  • The information I want to collect is the content of the <eir></eir> tags. There are two in this example, but there may be more. Commented Oct 5, 2021 at 15:02
  • I can't reproduce this. I get the output "config {}" and that's correct. With the full XML I get "{urn:ietf:params:xml:ns:netconf:base:1.0}data {}". Commented Oct 5, 2021 at 15:02
  • How are you obtaining the string? (Don't say you're reading it from an XML file) Commented Oct 5, 2021 at 15:03

3 Answers

6

Your original XML has namespaces. You need to honor them in your XPath queries.

import xml.etree.ElementTree as ET

reply_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:573a453c-72c0-4185-8c54-9010593dd102">
   <data>
      <config xmlns="http://www.calix.com/ns/exa/base">
        <!-- ... the rest of it ... -->
      </config>
   </data>
</rpc-reply>'''

ns = {
    'calix': 'http://www.calix.com/ns/exa/base'
}

root = ET.fromstring(reply_xml)
for eir in root.findall('.//calix:eir', ns):
    print(eir.text)

prints

1000000
10000000
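
For comparison, here is a minimal sketch of what happens with and without the namespace, plus the `{*}` wildcard that `findall` accepts from Python 3.8 on. The `reply_xml` below is a shortened copy of the reply, trimmed for illustration, and the variable names are my own:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the NETCONF reply, for illustration only.
reply_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
   <data>
      <config xmlns="http://www.calix.com/ns/exa/base">
         <profile>
            <policy-map>
               <name>ELINE_PM_1</name>
               <class-map-ethernet>
                  <ingress><eir>1000000</eir></ingress>
               </class-map-ethernet>
            </policy-map>
            <policy-map>
               <name>ELINE_PM_2</name>
               <class-map-ethernet>
                  <ingress><eir>10000000</eir></ingress>
               </class-map-ethernet>
            </policy-map>
         </profile>
      </config>
   </data>
</rpc-reply>'''

root = ET.fromstring(reply_xml)

# ElementTree stores namespaced tags as '{uri}name', so a bare 'eir'
# matches nothing in this document.
plain_hits = root.findall('.//eir')
print(plain_hits)

# '{*}eir' matches <eir> in any namespace (requires Python 3.8+).
wildcard_eirs = [e.text for e in root.findall('.//{*}eir')]
print(wildcard_eirs)
```

The wildcard saves the prefix mapping, but it will also match an `eir` element from any other namespace, so the explicit `ns` dictionary is the stricter choice.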

2

Your code works fine. It shows all children of the root element, which is only <config> .. </config> and it has no attributes.

To get to the <eir> tag, you should use XPath, or go through the tree recursively.

Quick solution for XPath:

root.findall('.//eir')
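
The recursive route can be sketched with `Element.iter()`, which walks every descendant of the root. Comparing only the local part of each tag (the text after `}`) is an assumption made here so the same loop works whether or not the XML carries namespaces; `local_name` is a hypothetical helper, not part of ElementTree, and the XML is a cut-down sample:

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Return the tag without its '{namespace}' prefix, if any."""
    return tag.rsplit('}', 1)[-1]

# Cut-down sample with a default namespace on <config>.
xml_text = '''<data>
   <config xmlns="http://www.calix.com/ns/exa/base">
      <profile>
         <policy-map>
            <ingress><eir>1000000</eir></ingress>
         </policy-map>
         <policy-map>
            <ingress><eir>10000000</eir></ingress>
         </policy-map>
      </profile>
   </config>
</data>'''

root = ET.fromstring(xml_text)

# iter() visits the root and all descendants in document order.
iter_eirs = [el.text for el in root.iter() if local_name(el.tag) == 'eir']
print(iter_eirs)
```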

1 Comment

That should not have worked: the OP's original XML has XML namespaces.

2

See below (a one-liner with XPath):

import xml.etree.ElementTree as ET

reply_xml='''
<data>
   <config>
      <profile>
         <policy-map>
            <name>ELINE_PM_1</name>
            <class-map-ethernet>
               <name>Eth-match-any-1</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>1000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
         <policy-map>
            <name>ELINE_PM_2</name>
            <class-map-ethernet>
               <name>Eth-match-any-2</name>
               <ingress>
                  <meter-type>meter-mef</meter-type>
                  <eir>20000000</eir>
               </ingress>
            </class-map-ethernet>
         </policy-map>
      </profile>
   </config>
</data>
'''

root = ET.fromstring(reply_xml)
eirs = [e.text for e in root.findall('.//eir')]
print(eirs)

output

['1000000', '20000000']
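
Since there may be more than two policy maps, it can help to keep each value next to the policy it belongs to. A rough sketch under the same assumptions as the snippet above (no namespaces, one `<eir>` per `<policy-map>`); `eir_by_policy` is just an illustrative name:

```python
import xml.etree.ElementTree as ET

reply_xml = '''
<data>
   <config>
      <profile>
         <policy-map>
            <name>ELINE_PM_1</name>
            <class-map-ethernet>
               <name>Eth-match-any-1</name>
               <ingress><eir>1000000</eir></ingress>
            </class-map-ethernet>
         </policy-map>
         <policy-map>
            <name>ELINE_PM_2</name>
            <class-map-ethernet>
               <name>Eth-match-any-2</name>
               <ingress><eir>20000000</eir></ingress>
            </class-map-ethernet>
         </policy-map>
      </profile>
   </config>
</data>
'''

root = ET.fromstring(reply_xml)

eir_by_policy = {}
for policy in root.findall('.//policy-map'):
    name = policy.findtext('name')   # direct child <name> of the policy-map
    eir = policy.findtext('.//eir')  # first <eir> anywhere below it
    if name is not None and eir is not None:
        eir_by_policy[name] = int(eir)

print(eir_by_policy)
```

If a policy map can hold several `<eir>` elements, `policy.findall('.//eir')` would be needed instead of `findtext`, which only returns the first match.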
