2

hi i have xml file whitch i want to parse, it looks something like this

<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
    <SHOPITEM>
        <ID>2332</ID>
        ...
    </SHOPITEM>
    <SHOPITEM>
        <ID>4433</ID>
        ...
    </SHOPITEM>
</SHOP>

my parsing code is

from lxml import etree

ifile = open('sample-file.xml', 'r')
file_data = etree.parse(ifile)

for item in file_data.iter('SHOPITEM'):
   print item

but item is print only when xml container

<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">

looks like

<SHOP>

how can i parse xml document without worrying about this container definition?

1 Answer 1

3

See here for an explanation of how lxml.etree handles namespaces. In general, you should work with them rather than try to avoid them. In this case, write:

for item in file_data.iter('{http://www.w3.org/1999/xhtml}SHOPITEM'):

If you need to refer the namespace frequently, setup a local variable:

xhtml_ns = '{http://www.w3.org/1999/xhtml}'
...
for item in file_data.iter(xhtml_ns + 'SHOPITEM'):
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.