XML Diff Utility

2.7.0 · active · verified Wed Apr 15

xmldiff is a Python library and command-line utility that computes semantic differences between XML files. Unlike line-by-line diff tools, it identifies structural and content changes in hierarchical XML data and can render them as human-readable diffs. The current version is 2.7.0 and the project is under active development; the library explicitly states that output formats and edit scripts may change between versions as a result of ongoing improvements.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to compare two XML files with `xmldiff`. It writes two small XML files to the current directory, calls `main.diff_files()` with a `formatting.XMLFormatter` to produce human-readable XML output with the differences marked, and then deletes the files. It also shows how to get the raw edit script (a list of actions) by passing parsed `lxml` elements directly to `main.diff_trees()`.

import os
from lxml import etree
from xmldiff import main, formatting

# Create dummy XML files
xml1_content = """
<root>
    <item id="1">Value A</item>
    <item id="2">Value B</item>
</root>
"""
xml2_content = """
<root>
    <item id="1">Value A - Changed</item>
    <item id="3">Value C</item>
    <item id="2" status="new">Value B</item>
</root>
"""

# Write both documents to disk with an explicit encoding
with open('file1.xml', 'w', encoding='utf-8') as f:
    f.write(xml1_content)
with open('file2.xml', 'w', encoding='utf-8') as f:
    f.write(xml2_content)

# Diff two XML files and format the output as XML with diff tags
diff_output_xml = main.diff_files(
    'file1.xml',
    'file2.xml',
    formatter=formatting.XMLFormatter(pretty_print=True)
)

print("--- XML Diff ---")
print(diff_output_xml)

# Clean up dummy files
os.remove('file1.xml')
os.remove('file2.xml')

# Example using diff_trees with lxml elements directly
tree1 = etree.fromstring(xml1_content)
tree2 = etree.fromstring(xml2_content)

diff_actions = main.diff_trees(tree1, tree2)
print("\n--- Edit Script (List of Actions) ---")
for action in diff_actions:
    print(action)
