I'm pulling XML from a SOAP API; the response looks like this:

<SOAP-ENV:Envelope xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ae="urn:sbmappservices72" xmlns:c14n="http://www.w3.org/2001/10/xml-exc-c14n#" xmlns:diag="urn:SerenaDiagnostics" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" xmlns:xenc="http://www.w3.org/2001/04/xmlenc#" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<SOAP-ENV:Header/>
<SOAP-ENV:Body>
    <ae:GetItemsByQueryResponse>
      <ae:return>
        <ae:item>
          <ae:id xsi:type="ae:ItemIdentifier">
            <ae:displayName/>
            <ae:id>10</ae:id>
            <ae:uuid>a9b91034-8f4d-4043-b9b6-517ba4ed3a33</ae:uuid>
            <ae:tableId>1541</ae:tableId>
            <ae:tableIdItemId>1541:10</ae:tableIdItemId>
            <ae:issueId/>
          </ae:id>

I can't for the life of me use findall to pull something like tableId. Most of the tutorials on parsing using lxml don't include namespaces, but the one at lxml.de does, and I've been trying to follow it.

According to their tutorial you should create a dictionary of the namespaces, which I've done like so:

r = tree.xpath('/e:SOAP-ENV/s:ae', 
        namespaces={'e': 'http://schemas.xmlsoap.org/soap/envelope/',
                    's': 'urn:sbmappservices72'})

But that appears to not be working, as when I try to get the len of r, it comes back as 0:

print 'length: ' + str(len(r)) #<---- always equals 0

Since the URI for the second namespace is a "urn:", I tried using a real URL to the wsdl as well, but that gives me the same result.

Is there something obvious that I'm missing? I just need to be able to pull values like the one for tableIdItemId.

Any help would be greatly appreciated.

  • The fact that one of the namespaces is defined using a URN does not matter. It is just as valid as a URL. Commented Jul 25, 2015 at 10:14
  • Thanks, I had assumed so, but since it wasn't working I wasn't certain. I'd give you an upvote if I could :] Commented Jul 26, 2015 at 20:18

1 Answer

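Your XPath doesn't match the XML structure: SOAP-ENV and ae are namespace prefixes, not element names. The path has to name the actual elements (Envelope, Body, and so on), each qualified with a prefix that your dictionary maps to the right namespace URI. Try this instead: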

r = tree.xpath('/e:Envelope/e:Body/s:GetItemsByQueryResponse/s:return/s:item/s:id/s:tableId', 
        namespaces={'e': 'http://schemas.xmlsoap.org/soap/envelope/',
                    's': 'urn:sbmappservices72'})

For a small document, you can use // instead of spelling out every step with /, which shortens the expression. For example:

r = tree.xpath('/e:Envelope/e:Body//s:tableId', 
        namespaces={'e': 'http://schemas.xmlsoap.org/soap/envelope/',
                    's': 'urn:sbmappservices72'})

The /e:Body//s:tableId part finds tableId no matter how deeply it is nested within Body. Note, however, that // is generally slower than an explicit / path, especially on a large document, because it has to search every descendant of Body.
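
Since the question mentions findall, here is a minimal sketch of the same idea with findall rather than xpath. The SAMPLE document is a trimmed copy of the response from the question with the closing tags filled in by assumption, and the names SAMPLE and NS are made up for illustration. The import falls back to the standard library's ElementTree if lxml is not installed; findall takes the same prefix-to-URI mapping in both. It is written for Python 3.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Fallback so the sketch still runs without lxml installed.
    import xml.etree.ElementTree as etree

# Trimmed, hypothetical copy of the response; closing tags are assumed.
SAMPLE = b"""<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ae="urn:sbmappservices72"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <SOAP-ENV:Header/>
  <SOAP-ENV:Body>
    <ae:GetItemsByQueryResponse>
      <ae:return>
        <ae:item>
          <ae:id xsi:type="ae:ItemIdentifier">
            <ae:id>10</ae:id>
            <ae:tableId>1541</ae:tableId>
            <ae:tableIdItemId>1541:10</ae:tableIdItemId>
          </ae:id>
        </ae:item>
      </ae:return>
    </ae:GetItemsByQueryResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

# The prefixes here are arbitrary; only the URIs must match the document.
NS = {
    'e': 'http://schemas.xmlsoap.org/soap/envelope/',
    's': 'urn:sbmappservices72',
}

root = etree.fromstring(SAMPLE)  # root is the Envelope element

# The path from the question: it treats the document's prefixes as
# element names, so nothing matches.
wrong = root.findall('e:SOAP-ENV/s:ae', namespaces=NS)

# Explicit path, relative to the Envelope root.
explicit = root.findall(
    'e:Body/s:GetItemsByQueryResponse/s:return/s:item/s:id/s:tableId',
    namespaces=NS)

# './/' searches all descendants, like '//' in the XPath above.
anywhere = root.findall('.//s:tableIdItemId', namespaces=NS)

print(len(wrong))
print([el.text for el in explicit])
print([el.text for el in anywhere])
```

With the sample above this should print 0, then ['1541'], then ['1541:10']. The paths are relative to the root element, so they start at Body rather than at /e:Envelope.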
