Python HTTP response data in XML

Question

I have an HTTP GET request whose response data I know looks like this:

<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
    </TestCase>
  </Tests>
</TestSpec>

I am trying to retrieve its data so I can validate it. The response is stored in g0:

    parser = etree.HTMLParser()
    tree=etree.fromstring(g0.text.encode('utf8'), parser)

How can I get that data? I have tried

    print ("\ntree= "+ str(tree.TestCase))

but it doesn't work.

Alex · Accepted Answer · 2019-03-27 20:09:29Z


Assuming g0.text.encode('utf8') returns the example XML string you gave, you shouldn't need the HTML parser; the default XML parser is enough. Try something like this:

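    from lxml import etree  # assumption: etree.HTMLParser in the question suggests lxml

    # g0 is the HTTP response object from the question
    tests = etree.fromstring(g0.text.encode('utf8'))[0]  # Notice the "[0]" here

    for testCase in tests.findall("{TestSpec.xsd}TestCase"):  # Notice the namespace here
        print(testCase[0].text)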

In the code above, I used [0] to get the 0th (first) child of the root element, which is the <Tests> tag. The for loop uses findall, which returns all of the children directly under the <Tests> tag that match the given tag name. Notice that I included the namespace from the top-level <TestSpec> tag in curly braces ahead of the tag name; because the root declares xmlns="TestSpec.xsd", every element in the document belongs to that namespace, and a bare "TestCase" would match nothing. In this case findall returns all of the outer <TestCase> tags. Inside the loop I used [0] again to get the first child of each outer <TestCase> tag, which is the inner <TestCase>. Finally, .text is the attribute that contains "abc".
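The steps described above can be tried without an HTTP request. This is only a minimal sketch: it assumes the standard library's xml.etree.ElementTree as a stand-in for lxml (indexing and findall behave the same way for this case), and xml_bytes is a hypothetical substitute for g0.text.encode('utf8').

```python
# Standard-library stand-in for lxml.etree (an assumption made so the
# sketch runs without extra packages).
import xml.etree.ElementTree as etree

# Hypothetical substitute for g0.text.encode('utf8').
xml_bytes = b"""<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
    </TestCase>
  </Tests>
</TestSpec>"""

root = etree.fromstring(xml_bytes)
print(root.tag)  # the namespace is part of the tag: {TestSpec.xsd}TestSpec

tests = root[0]  # first child of the root, i.e. <Tests>

# Outer <TestCase> elements, then the first child of each one.
values = [case[0].text for case in tests.findall("{TestSpec.xsd}TestCase")]
print(values)  # ['abc']
```

Printing root.tag is a quick way to see the exact namespace string that has to go inside the curly braces.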

This will work for XML like this:

<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
    </TestCase>
    <TestCase>
      <TestCase>def</TestCase>
    </TestCase>
    <TestCase>
      <TestCase>ghi</TestCase>
    </TestCase>
  </Tests>
</TestSpec>

But not like this:

<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
      <TestCase>def</TestCase>
      <TestCase>ghi</TestCase>
    </TestCase>
  </Tests>
</TestSpec>

In that case you would need to use findall or some other means of iterating instead of hardcoding [0] for the first child. Something like this:

    for outerTestCaseTag in tests.findall("{TestSpec.xsd}TestCase"):
        for innerTestCase in outerTestCaseTag.findall("{TestSpec.xsd}TestCase"):
            print(innerTestCase.text)

The same goes for the first [0] used to get the <Tests> tag: it assumes <Tests> is always the first child of the root. There is probably a more "Pythonic" comprehension that would replace the nested loop, but that is the general idea.
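One possible way to flatten the nested loop is sketched below. It again assumes the standard library's xml.etree.ElementTree in place of lxml; the prefix "ts" in the namespace map is an arbitrary name chosen for this example, and xml_bytes stands in for the real response body.

```python
import xml.etree.ElementTree as etree

# Hypothetical substitute for g0.text.encode('utf8'), using the second
# XML layout (several inner <TestCase> tags under one outer tag).
xml_bytes = b"""<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
      <TestCase>def</TestCase>
      <TestCase>ghi</TestCase>
    </TestCase>
  </Tests>
</TestSpec>"""

root = etree.fromstring(xml_bytes)
ns = {"ts": "TestSpec.xsd"}  # "ts" is an arbitrary prefix chosen here

# Look up <Tests> by name instead of hardcoding root[0].
tests = root.find("ts:Tests", ns)

# The nested loop written as a single comprehension.
nested = [
    inner.text
    for outer in tests.findall("ts:TestCase", ns)
    for inner in outer.findall("ts:TestCase", ns)
]

# The same elements selected with one path expression from the root.
by_path = [el.text for el in root.findall("ts:Tests/ts:TestCase/ts:TestCase", ns)]

print(nested)   # ['abc', 'def', 'ghi']
print(by_path)  # ['abc', 'def', 'ghi']
```

Passing a prefix-to-namespace mapping as the second argument keeps the paths readable compared with repeating {TestSpec.xsd} in front of every tag name.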

edited Mar 27, 2019 at 20:09

answered Mar 26, 2019 at 23:09

Alex


2 Comments

Ghost Over a year ago

My situation is the second one you wrote above, so I tried [0] and it didn't work.

Ghost Over a year ago

Thank you, I managed to get it working with your help. As it turns out, this is the answer:

    tests = etree.fromstring(g0.text.encode('utf8'))[0]  # Notice the "[0]" here
    for testCase in tests.findall("{TestSpec.xsd}TestCase"):  # Notice the namespace here
        print(testCase[0].text)

After that, testCase[1].text gives the next line.
