Using Python to get XML values and tags

Question

I have an XML file, part of which looks like this:

        <?xml version="1.0" encoding="UTF-8" ?>,
         <Settings>,
             <System>,
                 <Format>Percent</Format>,
                 <Time>12 Hour Format</Time>,
                 <Set>Standard</Set>,
             </System>,
             <System>,
                 <Format>Percent</Format>,
                 <Time>12 Hour Format</Time>,
                 <Set>Standard</Set>,
                 <Alarm>ON</Alarm>,
                 <Haptic>ON</Haptic>'
             </System>
          </Settings>

What I would like to do is use XPath to specify the path //Settings/System and get the tags and values inside each System element, so that I can populate a DataFrame with the following output:

| Format | Time | Set | Alarm | Haptic |
|:-------|:-----|:----|:------|:-------|
| Percent | 12 Hour Format | Standard | NaN | NaN |
| Percent | 12 Hour Format | Standard | ON | ON |

So far I have seen methods as follows:

import xml.etree.ElementTree as ET
root = ET.parse(filename)
result = ''

for elem in root.findall('.//child/grandchild'):
    # How to make decisions based on attributes even in 2.6:
    if elem.attrib.get('name') == 'foo':
        result = elem.text

These methods explicitly call elem.attrib.get('name'), which I would not be able to use in my case because of inconsistent elements within my /System tag. Is there a way to use XPath (or anything else) to specify /System and get all of its elements and their values?

What do you mean by "inconsistent elements within my /System tag."?
– Jack Fleeting, Commented Jul 8, 2021 at 21:53
@JackFleeting In the example I have three elements (Format, Time, Set), but in other XML files/strings there may be 5 to 10 different elements.
– GK89, Commented Jul 8, 2021 at 22:02
I see; and in those situations where you have, say, 5 elements, you are still interested only in these specific three?
– Jack Fleeting, Commented Jul 8, 2021 at 22:04
@JackFleeting No. I would want to be able to get all of them and then, in a later step of the code, concat them.
– GK89, Commented Jul 8, 2021 at 22:08
To make sure I understand you, please edit your question with a well-formed XML containing two /System elements with different numbers of child elements (say, 3 in the first and 5 in the second) together with the expected output dataframe.
– Jack Fleeting, Commented Jul 8, 2021 at 22:11

Jack Fleeting · Accepted Answer · 2021-07-08 22:45:51Z

Your xml is still not well formed, but assuming it's fixed and looks like the version before, the following should work:

#fixed xml
<?xml version="1.0" encoding="UTF-8" ?>
     <Settings>
         <System>
             <Format>Percent</Format>
             <Time>12 Hour Format</Time>
             <Set>Standard</Set>
         </System>
         <System>
             <Format>Percent</Format>
             <Time>12 Hour Format</Time>
             <Set>Standard</Set>
             <Alarm>ON</Alarm>
             <Haptic>ON</Haptic>
         </System>
     </Settings>
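
As a side note on well-formedness, one quick way to check a document is to hand it to the standard-library parser and see whether it raises. The sketch below is only illustrative: `is_well_formed` is a hypothetical helper name, and the two byte strings are shortened stand-ins rather than the full file. It shows, for instance, that a comma directly after the XML declaration is rejected, because text is not allowed before the root element.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Shortened stand-ins (assumptions for illustration, not the real file).
# The first has a comma after the declaration, i.e. text outside the root element.
broken = b'<?xml version="1.0" encoding="UTF-8" ?>,<Settings><System/></Settings>'
fixed = b'<?xml version="1.0" encoding="UTF-8" ?><Settings><System/></Settings>'

print(is_well_formed(broken))
print(is_well_formed(fixed))
```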

Now for the code itself:

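import xml.etree.ElementTree as ET
import pandas as pd

# filename is the path to the fixed XML file; getroot() returns the <Settings> element
root = ET.parse(filename).getroot()
rows, tags = [], []
# get all unique element names, in order of first appearance
for elem in root.findall('System//*'):
    if elem.tag not in tags:
        tags.append(elem.tag)
# now collect the required info: one row per <System>, None where a tag is missing
for elem in root.findall('System'):
    rows.append([elem.find(tag).text if elem.find(tag) is not None else None for tag in tags])
df = pd.DataFrame(rows, columns=tags)
print(df)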

Output:

        Format            Time       Set Alarm Haptic
    0  Percent  12 Hour Format  Standard  None   None
    1  Percent  12 Hour Format  Standard    ON     ON
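
A possible variation, not part of the answer above: instead of collecting the tag names first, each System element can be turned into a dict of its direct children, whatever they happen to be. This is a minimal sketch using only the standard library; `system_records` is a hypothetical helper name, and the XML string is the fixed sample with the declaration left out for brevity.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the fixed XML above (declaration omitted for brevity).
XML = """\
<Settings>
    <System>
        <Format>Percent</Format>
        <Time>12 Hour Format</Time>
        <Set>Standard</Set>
    </System>
    <System>
        <Format>Percent</Format>
        <Time>12 Hour Format</Time>
        <Set>Standard</Set>
        <Alarm>ON</Alarm>
        <Haptic>ON</Haptic>
    </System>
</Settings>
"""


def system_records(root):
    """Return one {tag: text} dict per <System>, using whatever children exist."""
    return [
        {child.tag: child.text for child in system}  # direct children only
        for system in root.findall("System")
    ]


root = ET.fromstring(XML)  # root is the <Settings> element
records = system_records(root)
for record in records:
    print(record)
```

Passing `records` to `pd.DataFrame(records)` should then give one column per distinct tag, with NaN wherever a System element lacks that tag, which matches the NaN cells in the expected output from the question. Note that this sketch assumes tag names are not repeated inside a single System element; a repeated tag would overwrite the earlier value in the dict.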
