I have this XML file:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<feed xml:base="https://receasy1p1942606901trial.hanatrial.ondemand.com:443/rec/Accrual_PO.xsodata/"
    xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
    xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
    xmlns="http://www.w3.org/2005/Atom">
    <title type="text">accruals_po</title>
    <id>https://receasy1p1942606901trial.hanatrial.ondemand.com:443/rec/Accrual_PO.xsodata/accruals_po</id>
    <author>
        <name />
    </author>
    <link rel="self" title="accruals_po" href="accruals_po" />
    <entry>
        <id>https://receasy1p1942606901trial.hanatrial.ondemand.com:443/rec/Accrual_PO.xsodata/accruals_po('96372537-120')</id>
        <title type="text"></title>
        <author>
            <name />
        </author>
        <link rel="edit" title="accruals_po" href="accruals_po('96372537-120')"/>
        <category term="receasy.accruals_poType" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
        <content type="application/xml">
            <m:properties>
                <d:PO_NUMBER m:type="Edm.String">96372537-120</d:PO_NUMBER>
                <d:SAP_AMT m:type="Edm.Single">109</d:SAP_AMT>
                <d:GL_ACCOUNT m:type="Edm.Int64">65009000</d:GL_ACCOUNT>
                <d:COMPANY_CODE m:type="Edm.String">US10_OH</d:COMPANY_CODE>
                <d:CONFIRMED_ACCRUAL_AMT m:type="Edm.Single">109</d:CONFIRMED_ACCRUAL_AMT>
                <d:FINAL_APPROVER m:type="Edm.String">europe\bamcguir</d:FINAL_APPROVER>
                <d:FINAL_GL_ACCOUNT m:type="Edm.Int64">65009000</d:FINAL_GL_ACCOUNT>
                <d:FINAL_COMPANY_CODE m:type="Edm.String">US10_OH</d:FINAL_COMPANY_CODE>
                <d:RECONCILIATION m:type="Edm.String">Successful</d:RECONCILIATION>
            </m:properties>
        </content>
    </entry>
</feed>

I'm trying to get the values listed below; they are all under the entry tag.

96372537-120

109

65009000

US10_OH

109

europe\bamcguir

65009000

US10_OH

Successful

This is the code I have as of now to get the values.

import xml.etree.ElementTree as ET

# Walk the tree level by level, printing each element's tag and attributes
tree = ET.parse('export.xml')
root = tree.getroot()
for child in root:
    print(child.tag, child.attrib)
    for child2 in child:
        print(child2.tag, child2.attrib)
        for child3 in child2:
            print(child3.tag, child3.attrib)
            for child4 in child3:
                print(child4.tag, child4.attrib)
                for child5 in child4:
                    print(child5.tag, child5.attrib)

This is part of the output that I get for PO_NUMBER.

{http://schemas.microsoft.com/ado/2007/08/dataservices}PO_NUMBER {'{http://schemas.microsoft.com/ado/2007/08/dataservices/metadata}type': 'Edm.String'}

I'm not able to get the value of PO_NUMBER which is 96372537-120. How do I get this value, and the other values as highlighted above?

1 Answer

In ElementTree, an element's (leading) text node is stored on its text attribute. tag is the name of the XML tag (in Clark's notation, {namespace}localname) and attrib holds the XML attributes only (their names are also in Clark's notation).

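So the .text of the property elements will give you the information you need. In your loop those elements are child4 (feed > entry > content > properties > property), so that is child4.text.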
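As a rough illustration of the three attributes, the sketch below parses a trimmed-down copy of the feed from the question and looks up a single property. The SAMPLE string and the D and M prefix constants are only there for the example; they are not part of ElementTree.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the feed from the question (illustrative only).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry><content type="application/xml"><m:properties>
    <d:PO_NUMBER m:type="Edm.String">96372537-120</d:PO_NUMBER>
  </m:properties></content></entry>
</feed>"""

# Namespace prefixes in Clark's notation (helper constants for this sketch).
D = "{http://schemas.microsoft.com/ado/2007/08/dataservices}"
M = "{http://schemas.microsoft.com/ado/2007/08/dataservices/metadata}"

root = ET.fromstring(SAMPLE)
po_number = root.find(".//" + D + "PO_NUMBER")

print(po_number.tag)     # qualified tag name
print(po_number.attrib)  # XML attributes only, as a dict
print(po_number.text)    # the element's text content
```

The value the question asks about is in po_number.text; tag and attrib never contain it.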

Incidentally, you can use Clark's notation {namespace}tag with ElementTree's regular querying API to access the content or properties element directly, so you don't have to iterate over everything by hand:

tree.iter('{http://schemas.microsoft.com/ado/2007/08/dataservices/metadata}properties')

will give you an iterator over all the "properties" elements in the tree. You can then loop over each one (called properties here) and read the text of each of its children:

for child in properties:
    print(child.text)
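Putting the two steps together, a minimal sketch might collect one dictionary per "properties" element, keyed by the tag name without its namespace. The SAMPLE string (a shortened copy of the question's feed) and the local_name helper are assumptions made for this example, not something ElementTree provides.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the feed from the question (illustrative only).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry><content type="application/xml"><m:properties>
    <d:PO_NUMBER m:type="Edm.String">96372537-120</d:PO_NUMBER>
    <d:SAP_AMT m:type="Edm.Single">109</d:SAP_AMT>
    <d:RECONCILIATION m:type="Edm.String">Successful</d:RECONCILIATION>
  </m:properties></content></entry>
</feed>"""

M_PROPERTIES = "{http://schemas.microsoft.com/ado/2007/08/dataservices/metadata}properties"


def local_name(tag):
    """Hypothetical helper: drop the '{namespace}' prefix from a tag."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(SAMPLE)

# One dict per <m:properties> element, so values stay grouped by entry.
rows = [
    {local_name(child.tag): child.text for child in properties}
    for properties in root.iter(M_PROPERTIES)
]

for row in rows:
    print(row)
```

Every value comes back as a string (for example "109"), so converting according to the m:type attribute is left to the caller.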

Note an oddity for mixed content (when an element can have both text and element children): in the ElementTree document model, only text that comes before the first child element is stored on the parent's .text; text that follows a child element is stored as .tail on that child, e.g.

<foo>
    bar
    <qux/>
    baz
</foo>

will have foo.text == "bar" but "baz" will be set on qux.tail (both with their surrounding whitespace, which is omitted here).
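The mixed-content case is easy to check interactively. The short sketch below parses the same foo document and prints both attributes with repr() so the surrounding whitespace stays visible; it also uses itertext(), which walks .text and .tail values in document order.

```python
import xml.etree.ElementTree as ET

# Same document as above, written as a string literal.
foo = ET.fromstring("<foo>\n    bar\n    <qux/>\n    baz\n</foo>")
qux = foo.find("qux")

print(repr(foo.text))  # text before the first child, whitespace included
print(repr(qux.text))  # <qux/> is empty, so it has no text of its own
print(repr(qux.tail))  # text that follows <qux/> inside <foo>

# itertext() yields every text and tail in document order.
words = "".join(foo.itertext()).split()
print(words)
```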

4 Comments

got it by using child4.text. Thank you so much for this explanation! Really appreciate it!
tree.iter('{http://schemas.microsoft.com/ado/2007/08/dataservices}*') will also do.
@Tomalak Indeed but I assumed they may have wanted to group/segregate values by property.
Agreed, this only works when there is no more than one <m:properties> element in the document.
