{"id":4812,"library":"tree-sitter-html","title":"Tree-sitter HTML Grammar","description":"tree-sitter-html provides the HTML grammar for the Tree-sitter parsing system. It enables parsing HTML code into a concrete syntax tree, facilitating static analysis, syntax highlighting, and code transformation. The library is actively maintained; the current version is 0.23.2.","status":"active","version":"0.23.2","language":"en","source_language":"en","source_url":"https://github.com/tree-sitter/tree-sitter-html","tags":["parsing","syntax-tree","html","grammar","tree-sitter"],"install":[{"cmd":"pip install tree-sitter tree-sitter-html","lang":"bash","label":"Install with pip"}],"dependencies":[{"reason":"This package provides only the HTML grammar; the core `tree-sitter` Python library supplies the `Language` and `Parser` classes needed to parse with it.","package":"tree-sitter"}],"imports":[{"note":"`tree_sitter_html` provides the grammar object via `tree_sitter_html.language()`, while the `Language` and `Parser` classes come from the core `tree_sitter` library.","wrong":"from tree_sitter_html import Language, Parser","symbol":"Language, Parser","correct":"from tree_sitter import Language, Parser\nimport tree_sitter_html"}],"quickstart":{"code":"import tree_sitter\nimport tree_sitter_html\n\n# Load the HTML language grammar\nHTML_LANGUAGE = tree_sitter.Language(tree_sitter_html.language())\n\n# Initialize the parser with the language\n# (recent py-tree-sitter releases no longer provide `parser.set_language()`)\nparser = tree_sitter.Parser(HTML_LANGUAGE)\n\n# Sample HTML code (must be bytes for tree-sitter)\nhtml_code = b\"<!DOCTYPE html>\\n<html><body><h1>Hello, Tree-sitter!</h1></body></html>\"\n\n# Parse the code\ntree = parser.parse(html_code)\n\n# Get the root node of the syntax tree\nroot_node = tree.root_node\n\n# Print an S-expression representation of the tree\n# (`str(node)` replaces the older `node.sexp()`)\nprint(str(root_node))","lang":"python","description":"This quickstart demonstrates how to initialize a `tree-sitter` parser with the `tree-sitter-html` 
grammar and parse a basic HTML string into a syntax tree, then print its S-expression representation."},"warnings":[{"fix":"Ensure both `tree-sitter` and `tree-sitter-html` are installed: `pip install tree-sitter tree-sitter-html`.","message":"The `tree-sitter` core library is a mandatory dependency and must be installed alongside `tree-sitter-html` for the Python bindings to function.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Thoroughly test parsing results with your specific HTML structures after updating the grammar. Refer to GitHub issues for known parsing quirks.","message":"Grammar updates can change the node types and structure of the generated syntax tree, which may break existing queries or code analysis logic. Common issues include unexpected handling of whitespace, attribute values, or void elements.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Install all relevant language grammars for files containing multiple embedded languages: `pip install tree-sitter-html tree-sitter-javascript tree-sitter-css`.","message":"When parsing documents that mix HTML with other languages (e.g., HTML inside PHP, JavaScript in `script` tags, CSS in `style` tags), ensure that the corresponding `tree-sitter` grammars (e.g., `tree-sitter-javascript`, `tree-sitter-css`) are also installed so that the injected languages can be parsed and highlighted completely and accurately.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Instead of `Language.build_library()`, pass the result of the grammar's `language()` function directly to `tree_sitter.Language()`: `HTML_LANGUAGE = tree_sitter.Language(tree_sitter_html.language())`.","message":"Older examples might show `tree_sitter.Language.build_library()` for loading grammars. 
This method was deprecated in py-tree-sitter 0.21 and removed in 0.22; pre-compiled Python wheels (like `tree-sitter-html`) do not need it.","severity":"deprecated","affected_versions":"py-tree-sitter 0.21.x and later"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}