I have an XML file with the following structure:
<Thread THREAD_SEQUENCE="Q268_R16">
<RelQuestion RELQ_ID="Q268_R16">
<RelQSubject>Best Bank.</RelQSubject>
<RelQBody>Hi ti all QL's; What bank you are using? and why? Are you using this bank just because it has an affiliate at home? Regards;</RelQBody>
</RelQuestion>
</Thread>
The XML file contains 244 RelQBody tags. What I want to do is get the text inside each RelQBody tag. I have tried something like this:
import xml.dom.minidom
dom = xml.dom.minidom.parse("test.xml")
data = dom.documentElement
question = data.getElementsByTagName("RelQBody")
i=1
for q in question:
    print("%i. %s" % (i, q.childNodes[0].data))
    i = i+1
But I keep getting this error:
Traceback (most recent call last):
  File "C:\Users\Administrator\Documents\python\test.py", line 13, in <module>
    print("%i. %s" % (i, q.childNodes[0].data))
IndexError: list index out of range
However, when I tried this code:
import xml.dom.minidom
dom = xml.dom.minidom.parse("test.xml")
data = dom.documentElement
question = data.getElementsByTagName("RelQBody")
i=1
for q in question:
    print("%i" % i)
    i = i+1
I got the numbers 1 to 244, which exactly matches the number of RelQBody tags in the dataset.
So why does it fail when I print the text but work when I print only the counter? Can someone tell me what I did wrong? I'm new to Python, so any help will be appreciated. Thanks.
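One possible cause, assuming some of the RelQBody elements in the file are empty: an empty element has no child nodes, so q.childNodes[0] raises IndexError, while simply counting the elements still works. Below is a minimal sketch that tolerates empty elements. The sample XML, the Root wrapper element, the second thread, and the body_texts helper are all made up for illustration and are not taken from the real file.

```python
import xml.dom.minidom
from xml.dom import Node

# Hypothetical sample: the second RelQBody is empty, so it has no child nodes.
SAMPLE = """<Root>
<Thread THREAD_SEQUENCE="Q268_R16">
<RelQuestion RELQ_ID="Q268_R16">
<RelQSubject>Best Bank.</RelQSubject>
<RelQBody>What bank you are using?</RelQBody>
</RelQuestion>
</Thread>
<Thread THREAD_SEQUENCE="Q269_R1">
<RelQuestion RELQ_ID="Q269_R1">
<RelQSubject>Empty body.</RelQSubject>
<RelQBody></RelQBody>
</RelQuestion>
</Thread>
</Root>"""


def body_texts(dom):
    """Return the text of every RelQBody element, using "" for empty ones."""
    texts = []
    for q in dom.getElementsByTagName("RelQBody"):
        # Collect only text and CDATA children instead of indexing childNodes[0],
        # so an element with no children simply yields an empty string.
        parts = [
            node.data
            for node in q.childNodes
            if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
        ]
        texts.append("".join(parts))
    return texts


dom = xml.dom.minidom.parseString(SAMPLE)
for i, text in enumerate(body_texts(dom), start=1):
    print("%i. %s" % (i, text))
```

With the real file, xml.dom.minidom.parse("test.xml") would take the place of parseString(SAMPLE). Checking len(q.childNodes) inside the original loop would also show which of the 244 elements is the one that fails.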