I have an HTTP GET request whose response data I know looks like this:

<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
    </TestCase>
  </Tests>
</TestSpec>

I am trying to retrieve its data so I can validate it. The response object is g0:

    parser = etree.HTMLParser()
    tree=etree.fromstring(g0.text.encode('utf8'), parser)

How can I get that data? I have tried

    print ("\ntree= "+ str(tree.TestCase))

but it doesn't work

1 Answer

Assuming g0.text.encode('utf8') is returning the example XML string you gave, then you shouldn't need to use the HTML parser. Try something like this:

from lxml import etree

tests = etree.fromstring(g0.text.encode('utf8'))[0] # Notice the "[0]" here

for testCase in tests.findall("{TestSpec.xsd}TestCase"): # Notice the namespace here
    print(testCase[0].text)

In the code above, I used [0] to get the first child (index 0) of the root element, which is the <Tests> tag. The for loop uses findall, which returns all direct children of <Tests> that match the given tag name. Notice that I put the namespace from the top-level <TestSpec> tag in curly braces ahead of the tag name. In this case findall returns all of the outer <TestCase> tags. Inside the loop I used [0] again to get the first child of each outer <TestCase> tag, which here is the inner <TestCase>. Finally, .text is the attribute that contains "abc".
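To make that walkthrough concrete, here is a minimal self-contained sketch. It assumes the response body is exactly the XML from the question, pasted in as a bytes literal in place of g0.text.encode('utf8'). It also uses the standard library's xml.etree.ElementTree as a stand-in so it runs without extra packages; the indexing and findall calls shown should behave the same way with lxml.etree. The names xml_bytes, root and values are made up for this example.

```python
import xml.etree.ElementTree as ET  # stand-in for lxml.etree in this sketch

# Sample body copied from the question; stands in for g0.text.encode('utf8')
xml_bytes = b"""<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
    </TestCase>
  </Tests>
</TestSpec>"""

root = ET.fromstring(xml_bytes)  # the <TestSpec> element
tests = root[0]                  # first child of the root: <Tests>

values = []
# findall only looks at direct children, so this matches the outer <TestCase> tags
for test_case in tests.findall("{TestSpec.xsd}TestCase"):
    values.append(test_case[0].text)  # first child: the inner <TestCase>

print(values)
```

Printing root.tag here shows {TestSpec.xsd}TestSpec, which is why the namespace has to be part of the name passed to findall.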

This will work for XML like this:

<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
    </TestCase>
    <TestCase>
      <TestCase>def</TestCase>
    </TestCase>
    <TestCase>
      <TestCase>ghi</TestCase>
    </TestCase>
  </Tests>
</TestSpec>

But not like this:

<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
      <TestCase>def</TestCase>
      <TestCase>ghi</TestCase>
    </TestCase>
  </Tests>
</TestSpec>

In that case you would need to use findall, or some other means of iterating, instead of hardcoding [0] for the first child. Something like this:

for outerTestCaseTag in tests.findall("{TestSpec.xsd}TestCase"):
    for innerTestCase in outerTestCaseTag.findall("{TestSpec.xsd}TestCase"):
        print(innerTestCase.text) 

The same goes for the first [0] used to get the <Tests> tag. I'm sure there is a more "pythonic" comprehension that would improve on that nested loop, but that is the general idea.
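One possible way to drop both the hardcoded [0] and the nested loop is to give findall a path plus a prefix-to-namespace mapping. The sketch below is only an illustration under a few assumptions: the body is the second XML layout shown above, the prefix "ts" is an arbitrary name chosen here, and the standard library's xml.etree.ElementTree is used as a stand-in (lxml.etree's findall accepts the same path and namespace mapping).

```python
import xml.etree.ElementTree as ET  # stand-in for lxml.etree in this sketch

# Second layout from the answer: several inner <TestCase> tags under one outer tag
xml_bytes = b"""<?xml version="1.0" encoding="utf-8"?>
<TestSpec xmlns="TestSpec.xsd">
  <Tests>
    <TestCase>
      <TestCase>abc</TestCase>
      <TestCase>def</TestCase>
      <TestCase>ghi</TestCase>
    </TestCase>
  </Tests>
</TestSpec>"""

# "ts" is an arbitrary prefix for this example; it maps to the document's namespace
NS = {"ts": "TestSpec.xsd"}

root = ET.fromstring(xml_bytes)

# The path walks <Tests> -> outer <TestCase> -> inner <TestCase> in one call
inner_values = [
    element.text
    for element in root.findall("ts:Tests/ts:TestCase/ts:TestCase", NS)
]

print(inner_values)  # ['abc', 'def', 'ghi']
```

Because the path names every level explicitly, it also works for the first layout, where each outer <TestCase> holds a single inner one.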


2 Comments

My situation is the second one you wrote above, so I tried [0] and it didn't work.
Thank you, I managed to get it working with your help. It turns out that tests = etree.fromstring(g0.text.encode('utf8'))[0] followed by for testCase in tests.findall("{TestSpec.xsd}TestCase"): print(testCase[0].text) is the answer; testCase[1].text then gives the next line.
