I have some XML file i want to validate and i have to do it with Python. I've tryed to validate it with XSD with lxml. But i get only one error which occurs first but i need all errors and mismatches in XML file. Is there any method how i can manage to get list of all errors with lxml? Or are there any other Python solutions?
6 Answers
The way to solve this problem was:
try: xmlschema.assertValid(xml_to_validate) except etree.DocumentInvalid, xml_errors: pass print "List of errors:\r\n", xml_errors.error_log
May be there are better ways to solve this problem :)
1 Comment
xml_errors, where does this come from?With lxml, you can iterate over error_log and print line number and error message for each error:
def validate_with_lxml(xsd_tree, xml_tree):
schema = lxml.etree.XMLSchema(xsd_tree)
try:
schema.assertValid(xml_tree)
except lxml.etree.DocumentInvalid:
print("Validation error(s):")
for error in schema.error_log:
print(" Line {}: {}".format(error.line, error.message))
Comments
Using XMLSchema library you can resort to the iter_errors method: https://xmlschema.readthedocs.io/en/latest/api.html?highlight=iter_errors#xmlschema.XMLSchemaBase.iter_errors
here my code (Python 3):
def get_validation_errors(xml_file, xsd_uri='example.xsd'):
xml_string = xml_file.read().decode('utf-8')
dir_path = os.path.dirname(os.path.realpath(__file__))
xsd_path = os.path.join(dir_path, xsd_uri)
schema = xmlschema.XMLSchema(xsd_path)
validation_error_iterator = schema.iter_errors(xml_string)
for idx, validation_error in enumerate(validation_error_iterator, start=1):
print(f'[{idx}] path: {validation_error.path} | reason: {validation_error.reason}')
Comments
You can
read the Python docs on Error exceptions to find out how to do a try/except design pattern, and in the except part you can store the mismatches (a list, or a list of pairs, for example), then recursively try/except again beginning at the point after the first error. Let me know if that isn't descriptive enough.
Cheers!
Comments
The solution provided doesn't work anymore due to several errors. Before applying assertValid on xmlschema, you have to specify it as follows:
try:
xmlschema_doc = lxml.etree.parse(filename)
xmlschema = lxml.etree.XMLSchema(xmlschema_doc)
xmlschema.assertValid(elem_tree)
except lxml.etree.ParseError as e:
raise lxml.etree.ParserError(str(e))
except lxml.etree.DocumentInvalid as e:
if ignore_errors:
raise lxml.etree.ValidationError("The Config tree is invalid with error
message:\n\n" + str(e))
1 Comment
def get_validation_errors(xml_file, xsd_file):
schema = xmlschema.XMLSchema(xsd_file)
validation_error_iterator = schema.iter_errors(xml_file)
errors = list()
for idx, validation_error in enumerate(validation_error_iterator, start=1):
errors.append(validation_error.__str__())
print(
f'[{idx}] path: {validation_error.path} | reason: {validation_error.reason} | message: {validation_error.message}')
return errors