I am trying to apply an XPath query to XML data which has namespaces using the following code:

from lxml import etree
from io import StringIO
    
xml = '''
    <gpx creator="udos" version="1.1" xmlns="http://www.topografix.com/GPX/1/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/GpxExtensions/v3 http://www.garmin.com/xmlschemas/GpxExtensionsv3.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3">
     <metadata>
      <time>2015-07-07T15:16:40Z</time>
     </metadata>
     <trk>
      <name>some name</name>
      <trkseg>
       <trkpt lat="46.3884140" lon="10.0286290">
        <ele>2261.8</ele>
        <time>2015-07-07T15:30:42Z</time>
       </trkpt>
       <trkpt lat="46.3884050" lon="10.0286240">
        <ele>2261.6</ele>
        <time>2015-07-07T15:30:43Z</time>
       </trkpt>
       <trkpt lat="46.3884000" lon="10.0286210">
        <ele>2262.0</ele>
        <time>2015-07-07T15:30:46Z</time>
       </trkpt>
       <trkpt lat="46.3884000" lon="10.0286210">
        <ele>2261.8</ele>
        <time>2015-07-07T15:30:47Z</time>
       </trkpt>
      </trkseg>
     </trk>
    </gpx>
    '''
    
# simulate reading the XML above from a file
file = StringIO(xml)   # on Python 2 use "file = StringIO(unicode(xml))"
    
# reading the xml from a file
tree = etree.parse(file)
    
ns = {'xmlns': 'http://www.topografix.com/GPX/1/1',
      'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
      'xmlns:gpxtpx': 'http://www.garmin.com/xmlschemas/TrackPointExtension/v1',
      'xmlns:gpxx': 'http://www.garmin.com/xmlschemas/GpxExtensions/v3'}
    
expr = 'trk/trkseg/trkpt/ele'
    
for element in tree.xpath(expr, namespaces=ns):
    print(element.text)

I expect the following output from the code:

2261.8
2261.6
2262.0
2261.8

When I replace the XML root element

<gpx creator="udos" version="1.1" xmlns="http://www.topografix.com/GPX/1/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/GpxExtensions/v3 http://www.garmin.com/xmlschemas/GpxExtensionsv3.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3">

with

<gpx>

the code works.

Any suggestions on how to get it to work with namespaces as well?

1 Answer

You can define your namespaces like this:

ns = {'n': 'http://www.topografix.com/GPX/1/1',
      'xsi': 'http://www.w3.org/2001/XMLSchema-instance',
      'gpxtpx': 'http://www.garmin.com/xmlschemas/TrackPointExtension/v1',
      'gpxx': 'http://www.garmin.com/xmlschemas/GpxExtensions/v3'}

This defines n as the prefix for 'http://www.topografix.com/GPX/1/1', which you can then use in your XPath query. The prefix name itself is arbitrary; only the URI has to match the one in the document. For example:

expr = 'n:trk/n:trkseg/n:trkpt/n:ele'

for element in tree.xpath(expr, namespaces=ns):
    print(element.text)

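This is needed because the root node declares 'http://www.topografix.com/GPX/1/1' as its default namespace (xmlns), so all child nodes automatically inherit that namespace unless a child uses a different prefix or declares a namespace of its own. XPath 1.0 has no notion of a default namespace, so an unprefixed name such as trk only matches elements that are in no namespace at all.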
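To see the inheritance for yourself, here is a minimal sketch on a trimmed-down copy of the GPX document. It assumes the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages; lxml reports tags in the same "{uri}name" form. The constant GPX_NS and the prefix n are just illustrative names.

```python
import xml.etree.ElementTree as ET

GPX_NS = 'http://www.topografix.com/GPX/1/1'

# Trimmed-down version of the GPX document from the question.
xml = '''<gpx xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <trkseg>
      <trkpt lat="46.3884140" lon="10.0286290"><ele>2261.8</ele></trkpt>
    </trkseg>
  </trk>
</gpx>'''

root = ET.fromstring(xml)

# Every element is reported as "{namespace-uri}local-name",
# even though only the root element carries the xmlns declaration.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)

# An unprefixed path looks for elements in no namespace, so nothing matches.
unprefixed = root.findall('trk/trkseg/trkpt/ele')

# A prefixed path plus a prefix-to-URI mapping matches the inherited namespace.
prefixed = root.findall('n:trk/n:trkseg/n:trkpt/n:ele', {'n': GPX_NS})

print(len(unprefixed), [el.text for el in prefixed])  # 0 ['2261.8']
```

Each printed tag starts with {http://www.topografix.com/GPX/1/1}, which is why the unprefixed path from the question finds nothing.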

Demo:

In [44]: ns = {'n': 'http://www.topografix.com/GPX/1/1',
   ....:       'xsi': 'http://www.w3.org/2001/XMLSchema-instance',
   ....:       'gpxtpx': 'http://www.garmin.com/xmlschemas/TrackPointExtension/v1',
   ....:       'gpxx': 'http://www.garmin.com/xmlschemas/GpxExtensions/v3'}

In [45]: expr = 'n:trk/n:trkseg/n:trkpt/n:ele'

In [46]: for element in tree.xpath(expr, namespaces=ns):
   ....:         print(element.text)
   ....:
2261.8
2261.6
2262.0
2261.8
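If you would rather not type the URIs by hand, one possible variation is to build the mapping from the root element's nsmap attribute in lxml. This is only a sketch on a shortened document: the default namespace is stored in nsmap under the key None, which XPath cannot use as a prefix, so it is renamed here to n (an arbitrary choice that assumes the document does not already declare a prefix n).

```python
from io import BytesIO
from lxml import etree

# Shortened version of the GPX document from the question.
xml = b'''<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3">
  <trk>
    <trkseg>
      <trkpt lat="46.3884140" lon="10.0286290"><ele>2261.8</ele></trkpt>
      <trkpt lat="46.3884050" lon="10.0286240"><ele>2261.6</ele></trkpt>
    </trkseg>
  </trk>
</gpx>'''

tree = etree.parse(BytesIO(xml))
root = tree.getroot()

# root.nsmap maps prefixes to URIs for the declarations in scope on the root.
# The default namespace appears under the key None; give it the prefix 'n'
# (assumption: 'n' is not already used as a prefix in the document).
ns = {('n' if prefix is None else prefix): uri
      for prefix, uri in root.nsmap.items()}

elevations = [el.text
              for el in tree.xpath('n:trk/n:trkseg/n:trkpt/n:ele', namespaces=ns)]
print(elevations)  # ['2261.8', '2261.6']
```

Note that nsmap only covers namespaces visible on the element you read it from, so prefixes declared deeper in the document would still have to be added manually.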
1 Comment

Which means that I messed up the XPath query: instead of using expr = 'trk/trkseg/trkpt/ele', I should have used expr = 'xmlns:trk/xmlns:trkseg/xmlns:trkpt/xmlns:ele' to account for the default ("blank") namespace.
