
I'm trying to parse a huge file; a sample is below. I want to select the <Name> element, but my XPath finds nothing. It works only if I remove this line:

<LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">

 

xml2 = '''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
<LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
                <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
                    <LevelLayoutSectionBase>
                        <LevelLayoutItemBase>
                            <Name>Tracking ID</Name>
                        </LevelLayoutItemBase>
                    </LevelLayoutSectionBase>
                </LevelLayout>
            </LevelLayout>
    </LevelLayouts>
</PackageLevelLayout>'''

from lxml import etree
# On Python 3, pass bytes: lxml rejects str input that carries an encoding declaration.
tree = etree.XML(xml2.encode('utf-8'))
nodes = tree.xpath('/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]/LevelLayout/LevelLayoutSectionBase/LevelLayoutItemBase/Name')
print(nodes)

2 Answers


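Your nested LevelLayout element declares a default namespace (xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain"), so it and every element below it, including Name, belong to that namespace. Unprefixed names in an XPath expression match only elements in no namespace, which is why your path finds nothing. I'd use: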

tree.xpath('.//LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]//*[local-name()="Name"]')

to match the Name element with a shorter XPath expression (ignoring the namespace altogether).
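
A minimal, self-contained sketch of the local-name() approach, assuming Python 3 and a trimmed copy of the sample document. The variable names (xml_bytes, root, names, name_texts) are illustrative and not from the question. The document is passed as bytes because lxml refuses str input that carries an encoding declaration.

```python
from lxml import etree

# Trimmed copy of the sample; bytes because of the encoding declaration.
xml_bytes = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
  <LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
      <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain">
        <LevelLayoutSectionBase>
          <LevelLayoutItemBase>
            <Name>Tracking ID</Name>
          </LevelLayoutItemBase>
        </LevelLayoutSectionBase>
      </LevelLayout>
    </LevelLayout>
  </LevelLayouts>
</PackageLevelLayout>'''

root = etree.XML(xml_bytes)

# local-name() looks only at the part of the name after the namespace,
# so this matches Name whichever namespace it belongs to.
names = root.xpath(
    './/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]'
    '//*[local-name()="Name"]'
)
name_texts = [el.text for el in names]
print(name_texts)
```

The trade-off is that this would also match a Name element from any other namespace under the same LevelLayout, which may or may not be what you want in a large file.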

The alternative is to pass a prefix-to-namespace mapping and use that prefix on every namespaced step of the path:

nsmap = {'acd': 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'}

tree.xpath('/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]/acd:LevelLayout/acd:LevelLayoutSectionBase/acd:LevelLayoutItemBase/acd:Name',
    namespaces=nsmap)
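
To make the difference concrete, here is a small sketch (assuming Python 3 and a trimmed copy of the sample; ACD_NS, base, unprefixed and prefixed are names made up for this illustration) that runs the same path with and without the prefix. The prefix itself is arbitrary; only the namespace URI it maps to has to match the document.

```python
from lxml import etree

ACD_NS = 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'

# Trimmed copy of the sample; bytes because of the encoding declaration.
xml_bytes = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
  <LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
      <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain">
        <LevelLayoutSectionBase>
          <LevelLayoutItemBase>
            <Name>Tracking ID</Name>
          </LevelLayoutItemBase>
        </LevelLayoutSectionBase>
      </LevelLayout>
    </LevelLayout>
  </LevelLayouts>
</PackageLevelLayout>'''

root = etree.XML(xml_bytes)
nsmap = {'acd': ACD_NS}

# The outer elements are in no namespace, so they take no prefix.
base = ('/PackageLevelLayout/LevelLayouts'
        '/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]')

# Unprefixed steps only match elements in no namespace: nothing is found.
unprefixed = root.xpath(
    base + '/LevelLayout/LevelLayoutSectionBase/LevelLayoutItemBase/Name')

# Prefixed steps match the elements inside the default namespace.
prefixed = root.xpath(
    base + '/acd:LevelLayout/acd:LevelLayoutSectionBase'
           '/acd:LevelLayoutItemBase/acd:Name',
    namespaces=nsmap)
prefixed_texts = [el.text for el in prefixed]

print(unprefixed, prefixed_texts)
```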

1 Comment

Thank you very much! It seems I need to learn XPath in more depth.

lxml's xpath method has a namespaces parameter. You can pass it a dict mapping namespace prefixes to namespace URIs. Then you can build XPaths that use those prefixes:

xml2 = '''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
<LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
                <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
                    <LevelLayoutSectionBase>
                        <LevelLayoutItemBase>
                            <Name>Tracking ID</Name>
                        </LevelLayoutItemBase>
                    </LevelLayoutSectionBase>
                </LevelLayout>
            </LevelLayout>
    </LevelLayouts>
</PackageLevelLayout>'''

namespaces={'ns': 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain',
            'i': 'http://www.w3.org/2001/XMLSchema-instance'}

import lxml.etree as ET
# This is an lxml.etree._Element, not a tree, so don't call it tree
# Encode to bytes: on Python 3, lxml rejects str input with an encoding declaration
root = ET.XML(xml2.encode('utf-8'))

nodes = root.xpath(
    '''/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]
       /ns:LevelLayout/ns:LevelLayoutSectionBase/ns:LevelLayoutItemBase/ns:Name''', namespaces=namespaces)
print(nodes)

yields

[<Element {http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain}Name at 0xb74974dc>]
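
The {...}Name in that output is how lxml writes a namespace-qualified tag: the namespace URI in braces, followed by the local name. As a rough sketch (assuming Python 3 and a much-reduced stand-in for the sample; NS, node and result are illustrative names), etree.QName can split such a tag apart, and the element's text is available as usual:

```python
from lxml import etree

NS = 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'

# Reduced stand-in for the sample document, just enough to hold one Name.
xml_bytes = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
  <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain">
    <Name>Tracking ID</Name>
  </LevelLayout>
</PackageLevelLayout>'''

root = etree.XML(xml_bytes)
node = root.xpath('//ns:Name', namespaces={'ns': NS})[0]

# node.tag is the full '{namespace}localname' string shown in the repr.
qname = etree.QName(node)
result = (qname.namespace, qname.localname, node.text)
print(node.tag)
print(result)
```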
