Python lxml: how to validate file using xml schema file defined in the xml file

Question

I'm using Python lxml library to parse xml files. I need to validate a xml file against a xml schema. lxml supports XML schema validation, but the xml schema filepath/content must be provided (information available here: http://lxml.de/validation.html). However, I don't know the xml schema filepath beforehand, it should be parsed from the xml file header tags. I can't find a way to access these tags.

lxml supports this use case somehow?

Martijn Pieters · Accepted Answer · 2012-09-07 17:16:11Z

3

If the schema is linked using attributes on the root element, in the http://www.w3.org/2001/XMLSchema-instance namespace, you can retrieve these with lxml by prefixing the attribute name with the namespace URL in curly braces:

XMLSchemaNamespace = '{http://www.w3.org/2001/XMLSchema-instance}'
document = lxml.parse(xmlfile)
schemaLink = document.get(XMLSchemaNamespace + 'schemaLocation')
if schemaLink is None:
    schemaLink = document.get(XMLSchemaNamespace + 'noNamespaceSchemaLocation')

and then use a URL library to load the schema from the location referred. See lxml namespace handling for more information.

answered Sep 7, 2012 at 17:16

Martijn Pieters

1.1m326 gold badges4.2k silver badges3.4k bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Python lxml: how to validate file using xml schema file defined in the xml file

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related