xPath with ElementTree (python) to parse XML from string

Question

I'm using ElementTree to parse some XML retrieved from a website, but I can't seem to get ".find" or ".findall" to work. I tried xml.etree.ElementTree and I tried lxml.etree, and neither works for me. My goal is to retrieve //course from the XML returned by a URL.

import requests
import xml.etree.ElementTree as ET
res = requests.get(COURSES_URL).text #Storing the XML into res
XML = ET.fromstring(res)
print(XML.findall('//COURSE'))

COURSES_URL is the URL I retrieve the XML from. The request itself works, since I get the XML I expect (sample):

<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by Oracle Reports version 11.1.2.1.0 -->
<SYRSPOS_REP>
  <LIST_G_PROGRAM>
    <G_PROGRAM>
      <SPRIDEN_ID>U712214</SPRIDEN_ID>
      <STUDENT_NAME>Mark Adam Johns</STUDENT_NAME>
      <SMBPOGN_PIDM>98</SMBPOGN_PIDM>
      <SMBPOGN_REQUEST_NO>46</SMBPOGN_REQUEST_NO>
      <COURSE ID=1411001>PASS</COURSE>
      <COURSE ID=1411023>PASS</COURSE>
      <COURSE ID=1411136>PASS</COURSE>
    </G_PROGRAM>
  </LIST_G_PROGRAM>
</SYRSPOS_REP>

Please post a minimal but complete sample of the XML, i.e. make sure it includes the target element <course> and that the entire XML is well-formed (single root element, closing tags, etc.).
– har07, Commented Feb 18, 2018 at 3:55
In addition to the previous comment, and taking into account that all tags of your small sample are upper case: did you try "//COURSE"?
– Markus, Commented Feb 18, 2018 at 16:01
Could it be that you are not paying attention to XML namespaces?
– Tomalak, Commented Feb 18, 2018 at 18:48
@thethiny After adding double quotes around the XML attribute values and a . at the beginning of the XPath, it worked: eval.in/958573. As Tomalak suggests, there are probably XML namespaces in the actual XML?
– har07, Commented Feb 19, 2018 at 3:34
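
The two fixes mentioned in the last comment can be checked in isolation. The following is a minimal sketch, assuming a trimmed one-course copy of the sample; the helper names try_parse and try_findall are hypothetical and exist only for illustration. It shows that the standard-library parser rejects unquoted attribute values, and that ElementTree refuses an absolute path such as '//COURSE' on an element while accepting the relative './/COURSE'.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the sample: one with the attribute unquoted (as posted),
# one with the attribute value quoted (well-formed XML).
UNQUOTED = '<G_PROGRAM><COURSE ID=1411001>PASS</COURSE></G_PROGRAM>'
QUOTED = '<G_PROGRAM><COURSE ID="1411001">PASS</COURSE></G_PROGRAM>'


def try_parse(xml_text):
    """Hypothetical helper: return the parsed root, or the ParseError raised."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return exc


def try_findall(root, path):
    """Hypothetical helper: return the matches, or the SyntaxError raised."""
    try:
        return root.findall(path)
    except SyntaxError as exc:
        return exc


# Unquoted attribute values are not well-formed, so parsing fails.
print(type(try_parse(UNQUOTED)).__name__)

root = try_parse(QUOTED)

# An absolute path is rejected when called on an element ...
print(type(try_findall(root, '//COURSE')).__name__)

# ... while a path relative to the element is accepted.
print(len(try_findall(root, './/COURSE')))
```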

thethiny · Accepted Answer · 2020-10-28 07:04:14Z

2

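Solved: Apparently I had two issues. First, findall returns a list of Element objects, so printing the result directly only shows the elements rather than their contents; I had to loop over it (for i in XML.findall(...)) and print i.text (an attribute, not a method). Second, I had to add a dot at the start of the XPath, right after the opening quotation mark, as in ".//COURSE".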
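
Put together, the fix might look like the sketch below. It assumes a shortened copy of the sample from the question with the ID attribute values quoted so that the XML is well-formed, and it parses an inline string (SAMPLE_XML, a name introduced here) instead of fetching COURSES_URL.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample from the question; the ID values are quoted
# here (an assumption) because unquoted attributes are not well-formed XML.
SAMPLE_XML = """<SYRSPOS_REP>
  <LIST_G_PROGRAM>
    <G_PROGRAM>
      <SPRIDEN_ID>U712214</SPRIDEN_ID>
      <COURSE ID="1411001">PASS</COURSE>
      <COURSE ID="1411023">PASS</COURSE>
      <COURSE ID="1411136">PASS</COURSE>
    </G_PROGRAM>
  </LIST_G_PROGRAM>
</SYRSPOS_REP>"""

root = ET.fromstring(SAMPLE_XML)

# ".//COURSE" searches all descendants of root; findall returns a list of
# Element objects, so read .get() and .text from each one.
courses = [(course.get("ID"), course.text) for course in root.findall(".//COURSE")]

for course_id, result in courses:
    print(course_id, result)
```

With the real response, the same loop would run on ET.fromstring(res) from the question's code, provided the server returns well-formed XML.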

edited Oct 28, 2020 at 7:04

answered Feb 19, 2018 at 21:35

thethiny

