2

Given the xml

xmlstr = '''
<myxml>
    <Description id="10">
      <child info="myurl"/>
    </Description>
</myxml>'

I'd like to get the id of Description only where child has an attribute of info.

import xml.etree.ElementTree as ET
root = ET.fromstring(xmlstr)
a = root.find(".//Description/[child/@info]")
print(a.attrib)

and changing the find to .//Description/[child[@info]]

both return an error of:

SyntaxError: invalid predicate

I know that etree only supports a subset of xpath, but this doesn't seem particularly weird - should this work? If so, what have I done wrong?!

Changing the find to .//Description/[child] does work, and returns

{'id': '10'}

as expected

2 Answers 2

1

You've definitely hit that XPath limited support limitation as, if we look at the source directly (looking at 3.7 source code), we could see that while parsing the Element Path expression, only these things in the filters are considered:

  • [@attribute] predicate
  • [@attribute='value']
  • [tag]
  • [.='value'] or [tag='value']
  • [index] or [last()] or [last()-index]

Which means that both of your rather simple expressions are not supported.


If you really want/need to stick with the built-in ElementTree library, one way to solve this would be with finding all Description tags via .findall() and filtering the one having a child element with info attribute.

Sign up to request clarification or add additional context in comments.

4 Comments

I don't need to stick with ElementTree - I'd tried lxml too, but found it slightly less intuitive to use...
@ChrisW yeah, with lxml, your expression .//Description[child/@info]/@id would work as-is. There is also BeautifulSoup, which could potentially be a more intuitive option. Check it out. Thanks.
Yep, even wanting a specific value for the attribute works with lxml: root.xpath("//Description[child/@info='myurl']") :)
@ChrisW it is definitely a powerful tool. And, it is blazingly fast!
0

You can also get those values as keys, which makes it a bit more structured approach to gather data:

import xml.etree.ElementTree as ET
root = ET.fromstring(xmlstr)
wht =root.find(".//Description") 
wht.keys() #--> ['id']
wht.get('id')  # --> '10'

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.