{"id":4402,"library":"tree-sitter-xml","title":"Tree-sitter XML & DTD Grammars","description":"tree-sitter-xml provides pre-compiled Tree-sitter grammars for XML and DTD. Used together with the `tree-sitter` library, it enables fast, error-tolerant parsing of XML and DTD documents in Python applications. The current version is 0.7.0; updates typically coincide with upstream grammar improvements or changes to the core library.","status":"active","version":"0.7.0","language":"en","source_language":"en","source_url":"https://github.com/tree-sitter-grammars/tree-sitter-xml","tags":["tree-sitter","xml","dtd","parsing","grammar","syntax tree"],"install":[{"cmd":"pip install tree-sitter-xml","lang":"bash","label":"Install `tree-sitter-xml`"},{"cmd":"pip install tree-sitter","lang":"bash","label":"Install core `tree-sitter` library (required)"}],"dependencies":[{"reason":"This library provides grammars; `tree-sitter` provides the core parsing engine that uses them.","package":"tree-sitter","optional":false}],"imports":[{"note":"The XML grammar is exposed in the top-level package as `language_xml()`; there is no generic `language()` function.","wrong":"from tree_sitter_xml import language","symbol":"language_xml","correct":"from tree_sitter_xml import language_xml"},{"note":"The DTD grammar is exposed in the top-level package as `language_dtd()`.","wrong":"from tree_sitter_xml import dtd_language","symbol":"language_dtd","correct":"from tree_sitter_xml import language_dtd"}],"quickstart":{"code":"from tree_sitter import Language, Parser\nimport tree_sitter_xml\n\n# Load the XML grammar and wrap it in a tree-sitter Language object\nXML_LANGUAGE = Language(tree_sitter_xml.language_xml())\n\n# Initialize the parser with that language\nparser = Parser(XML_LANGUAGE)\n\n# Sample XML string\nxml_code = \"\"\"\n<root>\n  <item id=\"1\">Value 1</item>\n  <item id=\"2\">Value 2</item>\n</root>\n\"\"\"\n\n# Parse the XML (the parser expects bytes)\ntree = parser.parse(xml_code.encode('utf8'))\n\n# Print the S-expression (a common way to inspect the parse tree)\nprint(f\"Parsed XML tree S-expression:\\n{tree.root_node}\")\n\n# The 'item' elements are nested inside the root element, so walk the whole tree\ndef iter_nodes(node):\n    yield node\n    for child in node.children:\n        yield from iter_nodes(child)\n\nitem_nodes = [node for node in iter_nodes(tree.root_node) if node.type == 'element' and node.text.decode('utf8').startswith('<item')]\n\nprint(f\"\\nFound {len(item_nodes)} 'item' elements.\")\nif item_nodes:\n    print(f\"First item's text: {item_nodes[0].text.decode('utf8')}\")","lang":"python","description":"This quickstart loads the XML grammar with `tree_sitter_xml.language_xml()`, wraps it in `tree_sitter.Language`, and parses a simple XML string with `tree_sitter.Parser`. It then prints the S-expression of the parse tree and walks the tree recursively to find the nested `item` elements."},"warnings":[{"fix":"Wrap the grammar in `tree_sitter.Language` and pass it to the parser: `parser = tree_sitter.Parser(tree_sitter.Language(tree_sitter_xml.language_xml()))`.","message":"`tree-sitter-xml` does not provide parsing functions itself. It exposes `language_xml()` and `language_dtd()` to retrieve the pre-compiled grammars, which must then be wrapped in `tree_sitter.Language` and used with `tree_sitter.Parser` for actual parsing. You need both libraries.","severity":"gotcha","affected_versions":"All"},{"fix":"Re-test your parsing logic after every `tree-sitter` upgrade, and consult the `tree-sitter` and `tree-sitter-xml` release notes.","message":"Breaking changes in the `tree-sitter` Python bindings can change how the grammars from `tree-sitter-xml` are loaded. For example, `tree-sitter` releases from 0.22 onward no longer provide `Parser.set_language()` or `Language.build_library()`; the grammar is wrapped in `Language(...)` and passed to `Parser(...)` instead, and `str(node)` replaces `node.sexp()`.","severity":"breaking","affected_versions":"Potentially any `tree-sitter` minor version bump (e.g., 0.x to 0.y) that `tree-sitter-xml` has not yet been updated for."},{"fix":"Choose the grammar function (`language_xml()` or `language_dtd()`) that matches the file type you intend to parse.","message":"XML and DTD grammars are distinct. Use `tree_sitter_xml.language_xml()` for parsing XML documents and `tree_sitter_xml.language_dtd()` for parsing DTD files; they are not interchangeable.","severity":"gotcha","affected_versions":"All"},{"fix":"If importing or loading the grammar fails despite correct imports, check platform compatibility and make sure your `pip` environment matches one of the distributed binary wheels. On unsupported platforms, consider installing from the source distribution, which compiles the grammar locally and requires a C compiler. Note that `tree_sitter.Language.build_library` is not available in current `tree-sitter` releases.","message":"Pre-compiled grammars (`.so`, `.dylib`, or `.dll` files) are platform-specific. `tree-sitter-xml` distributes binaries for common platforms, but problems can arise in unusual environments (e.g., less common operating systems or Python distributions) or when compiling the grammar yourself.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}