XML Schema Validator and Decoder

4.3.1 · active · verified Thu Apr 09

xmlschema is a Python library for validating XML documents against XML Schema Definition (XSD) files and for decoding XML data into Python dictionaries. It supports XPath 1.0/2.0+ for schema processing and data extraction. The library is actively maintained with regular major and minor releases, typically several per year.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to load an XML Schema, validate an XML document against it, and decode the validated XML into a Python dictionary. It also shows an example of how invalid XML is handled during validation.

import xmlschema

# Define a simple XML Schema (XSD) and an XML document
xsd_content = '''
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="data">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
'''

xml_content = '''
<data>
  <item>First</item>
  <item>Second</item>
</data>
'''

# 1. Load the XML Schema
try:
    schema = xmlschema.XMLSchema(xsd_content)
    print("Schema loaded successfully.")
except xmlschema.XMLSchemaException as e:
    print(f"Schema error: {e}")
    exit(1)

# 2. Validate an XML document against the schema
try:
    schema.validate(xml_content)
    print("XML document is valid.")
except xmlschema.ValidationError as e:
    print(f"XML validation error: {e}")

# 3. Decode the XML document into a Python dictionary
try:
    data_dict = schema.to_dict(xml_content)
    print("\nDecoded XML to dictionary:")
    print(data_dict)
except xmlschema.ValidationError as e:
    print(f"Decoding error: {e}")

# Example of invalid XML
invalid_xml_content = '''
<data>
  <extra_item>This should not be here</extra_item>
  <item>Valid Item</item>
</data>
'''
try:
    schema.validate(invalid_xml_content)
    print("Invalid XML document is valid (ERROR!).")
except xmlschema.ValidationError as e:
    print(f"Invalid XML correctly failed validation: {e.reason.splitlines()[0]}...")

view raw JSON →