{"id":6948,"library":"xmldiff","title":"XML Diff Utility","description":"xmldiff is a Python library and command-line utility designed to create semantic differences between XML files. Unlike traditional line-by-line diff tools, it focuses on identifying structural and content changes in hierarchical XML data, often generating human-readable diffs. It is currently at version 2.7.0 and is under active development, though the library explicitly states that output formats and edit scripts might change between versions due to ongoing improvements.","status":"active","version":"2.7.0","language":"en","source_language":"en","source_url":"https://github.com/Shoobx/xmldiff","tags":["xml","diff","comparison","utility","lxml","semantic diff"],"install":[{"cmd":"pip install xmldiff","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Required for parsing and manipulating XML trees, central to xmldiff's functionality.","package":"lxml"},{"reason":"Optional dependency for faster text comparisons in certain scenarios.","package":"diff-match-patch","optional":true}],"imports":[{"note":"Provides the primary diffing functions like `diff_files`, `diff_texts`, `diff_trees`, and `patch_file`.","symbol":"main","correct":"from xmldiff import main"},{"note":"Contains built-in formatters like `XMLFormatter` for controlling output style.","symbol":"formatting","correct":"from xmldiff import formatting"},{"note":"Essential for creating and manipulating XML tree objects when using `diff_trees` or processing formatter output.","symbol":"etree","correct":"from lxml import etree"}],"quickstart":{"code":"import os\nfrom lxml import etree\nfrom xmldiff import main, formatting\n\n# Create dummy XML files\nxml1_content = \"\"\"\n<root>\n    <item id=\"1\">Value A</item>\n    <item id=\"2\">Value B</item>\n</root>\n\"\"\"\nxml2_content = \"\"\"\n<root>\n    <item id=\"1\">Value A - Changed</item>\n    <item id=\"3\">Value C</item>\n    <item id=\"2\" status=\"new\">Value 
B</item>\n</root>\n\"\"\"\n\nwith open('file1.xml', 'w') as f:\n    f.write(xml1_content)\nwith open('file2.xml', 'w') as f:\n    f.write(xml2_content)\n\n# Diff two XML files and format the output as XML with diff tags\ndiff_output_xml = main.diff_files(\n    'file1.xml',\n    'file2.xml',\n    formatter=formatting.XMLFormatter(pretty_print=True)\n)\n\nprint(\"--- XML Diff ---\")\nprint(diff_output_xml)\n\n# Clean up dummy files\nos.remove('file1.xml')\nos.remove('file2.xml')\n\n# Example using diff_trees with lxml elements directly\ntree1 = etree.fromstring(xml1_content)\ntree2 = etree.fromstring(xml2_content)\n\ndiff_actions = main.diff_trees(tree1, tree2)\nprint(\"\\n--- Edit Script (List of Actions) ---\")\nfor action in diff_actions:\n    print(action)\n","lang":"python","description":"This quickstart demonstrates how to use `xmldiff` to compare two XML files. It creates two temporary XML files, then calls `main.diff_files()` with `formatting.XMLFormatter` to get a human-readable XML output with differences marked. It also shows how to get the raw 'edit script' (list of actions) by calling `main.diff_trees()` with `lxml` Element objects directly."},"warnings":[{"fix":"Review the official documentation for xmldiff 2.x and rewrite code to use the new API, particularly `main.diff_files`, `diff_texts`, `diff_trees`, and `formatting` modules.","message":"xmldiff 2.0 introduced a complete, ground-up rewrite of the library. This change included a new API, different output formats, and was initially significantly slower than previous 0.x/1.x versions. Code written for 0.x/1.x is incompatible with 2.x and later.","severity":"breaking","affected_versions":"2.0.0 and later"},{"fix":"Ensure that any XML trees passed to `xmldiff` functions are `lxml.etree._Element` or `lxml.etree._ElementTree` objects. 
Convert from `xml.etree` if necessary, e.g., by parsing XML strings directly with `lxml.etree.fromstring` or `lxml.etree.parse`.","message":"The `xmldiff` library expects `lxml` ElementTree instances when using functions like `diff_trees()`. Passing standard `xml.etree.ElementTree` objects will result in errors or unexpected behavior, requiring conversion to `lxml` types first.","severity":"gotcha","affected_versions":"All 2.x versions"},{"fix":"When writing tests or parsers for `xmldiff` output, focus on the semantic correctness of the changes rather than exact string or action sequence matches. Consider flexible parsing or asserting on the presence/absence of expected changes rather than full output equality.","message":"The output (edit script or formatted XML) generated by `xmldiff` can change between minor versions. The library explicitly states that there are 'no guarantees' the output will be the same across versions, as it's under 'rapid development'. This means automated tests relying on exact output matches may break.","severity":"gotcha","affected_versions":"All 2.x versions"},{"fix":"Upgrade to `xmldiff` version 2.6.0 or higher for improved namespace handling. Avoid scenarios where a namespace prefix's URI is changed between the two XML documents; instead, ensure prefixes map to consistent URIs or handle such cases manually before diffing.","message":"Prior to version 2.6, `xmldiff` had limited or buggy handling of XML namespaces, potentially leading to 'Unknown namespace prefix' errors. While improved in 2.6, changing the URI of an existing namespace prefix is still not supported and will raise an error.","severity":"gotcha","affected_versions":"< 2.6.0 (Namespace bugs), All versions (Changing URI for prefix)"},{"fix":"Understand the trade-off between speed and edit-script quality when choosing `--ratio-mode` (`ratio_mode` in the Python `diff_options`) or `--fast-match` (`fast_match`). When the most precise node matching matters, use `accurate` mode. 
For large files where a 'good enough' diff is sufficient, the `fast` or `faster` modes or fast matching can be considered, at the cost of a less optimal edit script.","message":"The `--ratio-mode` (`accurate`, `fast`, `faster`) and `--fast-match` options can significantly affect the diff's accuracy and performance. The `faster` mode and fast matching, in particular, trade match quality for speed: the resulting edit script is often less optimal, using more actions than necessary to describe the same change.","severity":"gotcha","affected_versions":"All 2.x versions"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}