I'm trying to parse this XML using ElementTree in the latest version of Python. I'd like to count the APPINFO elements and then get the data out of the last APPINFO in the tree. So far I can get the number of APPINFO elements using

count = len(root.findall("./APPINFO"))

But how do I reference only the last one in the tree and extract the values?

<APPLICANT>
<APPINFO>
    <FIRSTNAME>Joe</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
    <OTHERNAME></OTHERNAME>
</APPINFO>
<APPINFO>
    <FIRSTNAME>Peter</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
    <OTHERNAME></OTHERNAME>
</APPINFO>
<APPINFO> <!-- I need the data out of this one only -->
    <FIRSTNAME>John</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
    <OTHERNAME></OTHERNAME>
</APPINFO>
</APPLICANT>

    last=root.findall("./APPINFO")[-1] Commented Sep 16, 2017 at 15:40

2 Answers


Here is a working example that counts the elements and accesses the last one. With Python lists, negative indices count from the end, so appinfo[-1] is the last element.

from xml.etree import ElementTree as et

data = '''\
<APPLICANT>
  <APPINFO>
    <FIRSTNAME>Joe</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
    <OTHERNAME></OTHERNAME>
  </APPINFO>
  <APPINFO>
    <FIRSTNAME>Peter</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
    <OTHERNAME></OTHERNAME>
  </APPINFO>
  <APPINFO>
    <FIRSTNAME>John</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
    <OTHERNAME></OTHERNAME>
  </APPINFO>
</APPLICANT>'''

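tree = et.fromstring(data)                  # root element (APPLICANT)
appinfo = tree.findall("./APPINFO")         # list of direct APPINFO children
print(len(appinfo))                         # number of APPINFO elements
et.dump(appinfo[-1])                        # print the last APPINFO as XML
print(appinfo[-1].find('FIRSTNAME').text)   # text of one child of the last APPINFO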

Output:

3
<APPINFO>
    <FIRSTNAME>John</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME />
    <OTHERNAME />
  </APPINFO>
John
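To pull every value out of the last APPINFO rather than just FIRSTNAME, one option is to build a dict from its child elements. This is only a sketch: the helper name last_appinfo_fields is made up for illustration, the sample data is shortened, and empty elements (whose .text is None) are mapped to empty strings by assumption.

```python
from xml.etree import ElementTree as et

data = """\
<APPLICANT>
  <APPINFO>
    <FIRSTNAME>Peter</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
  </APPINFO>
  <APPINFO>
    <FIRSTNAME>John</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <MIDDLENAME></MIDDLENAME>
  </APPINFO>
</APPLICANT>"""


def last_appinfo_fields(root):
    """Hypothetical helper: {tag: text} for the last APPINFO, or {} if none exist."""
    appinfos = root.findall("./APPINFO")
    if not appinfos:
        return {}
    # Iterating an element yields its direct children in document order.
    # An empty element has text None, so fall back to "".
    return {child.tag: (child.text or "") for child in appinfos[-1]}


fields = last_appinfo_fields(et.fromstring(data))
print(fields)
```

Returning an empty dict when there is no APPINFO avoids the IndexError that appinfos[-1] would raise on an empty list.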

2 Comments

Suppose I have <DATA> <EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST1</EMP_NAME1> </EMPLOYMENT> <EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST2</EMP_NAME1> </EMPLOYMENT> </DATA> <DATA> <EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST2</EMP_NAME1> </EMPLOYMENT> <EMPLOYMENT> <TERMS></TERMS> <EMP_NAME1>TEST4</EMP_NAME1> </EMPLOYMENT> </DATA> How do I pull the values out of each employment element in the latest instance of DATA?
@AlexanderWaldbaum Locate the DATA elements and get the last one similar to the above example, then iterate over data.findall('EMPLOYMENT'). If you need more detail, ask another question.
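The approach described in that reply could look roughly like the following. It assumes the two DATA elements share a single parent; the wrapping ROOT element is invented here because an XML document needs exactly one root, and the variable names are illustrative.

```python
from xml.etree import ElementTree as et

# ROOT is a hypothetical wrapper; the comment only shows the DATA elements.
xml_text = """\
<ROOT>
  <DATA>
    <EMPLOYMENT><TERMS></TERMS><EMP_NAME1>TEST1</EMP_NAME1></EMPLOYMENT>
    <EMPLOYMENT><TERMS></TERMS><EMP_NAME1>TEST2</EMP_NAME1></EMPLOYMENT>
  </DATA>
  <DATA>
    <EMPLOYMENT><TERMS></TERMS><EMP_NAME1>TEST2</EMP_NAME1></EMPLOYMENT>
    <EMPLOYMENT><TERMS></TERMS><EMP_NAME1>TEST4</EMP_NAME1></EMPLOYMENT>
  </DATA>
</ROOT>"""

root = et.fromstring(xml_text)

# Take the last DATA element, then walk its EMPLOYMENT children.
last_data = root.findall("./DATA")[-1]
names = [emp.findtext("EMP_NAME1") for emp in last_data.findall("EMPLOYMENT")]
print(names)
```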
allAppInfo = root.findall("./APPINFO")

The above returns a list of the matching elements.

count = len(allAppInfo)

The above returns the number of elements in the list allAppInfo.

last = allAppInfo[count - 1]

The above returns the last element in the list, which is the element at index count - 1.

last = allAppInfo[-1]

The above also returns the last element: a negative index counts from the end of the list, so -1 is the last item.
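As an alternative to indexing the list, ElementTree's limited XPath support accepts a last() position predicate, and find() returns None instead of raising when nothing matches. A small sketch with shortened sample data; the variable names are illustrative.

```python
from xml.etree import ElementTree as et

root = et.fromstring(
    "<APPLICANT>"
    "<APPINFO><FIRSTNAME>Joe</FIRSTNAME></APPINFO>"
    "<APPINFO><FIRSTNAME>John</FIRSTNAME></APPINFO>"
    "</APPLICANT>"
)

# [last()] selects the last APPINFO child directly.
last = root.find("./APPINFO[last()]")
# Compare against None explicitly; an element's truth value is not a
# reliable "was it found" check.
first_name = last.findtext("FIRSTNAME") if last is not None else None
print(first_name)

# With no APPINFO children, find() returns None, whereas
# root.findall("./APPINFO")[-1] would raise IndexError.
empty = et.fromstring("<APPLICANT/>")
missing = empty.find("./APPINFO[last()]")
print(missing)
```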
