Tree-sitter HTML Grammar

0.23.2 · active · verified Sun Apr 12

tree-sitter-html provides the HTML grammar for the Tree-sitter parsing library. With it, Tree-sitter can parse HTML source into a concrete syntax tree for use in static analysis, syntax highlighting, and code transformation. The package is actively maintained; the current version is 0.23.2.

Warnings

Install

Imports

Quickstart

This quickstart initializes a `tree-sitter` parser with the `tree-sitter-html` grammar, parses a short HTML string into a syntax tree, and prints the tree's S-expression representation.

import tree_sitter
import tree_sitter_html

# Load the HTML language grammar
HTML_LANGUAGE = tree_sitter.Language(tree_sitter_html.language())

# Initialize the parser with the language
# (recent py-tree-sitter releases removed Parser.set_language)
parser = tree_sitter.Parser(HTML_LANGUAGE)

# Sample HTML code (must be bytes for tree-sitter)
html_code = b"<!DOCTYPE html>\n<html><body><h1>Hello, Tree-sitter!</h1></body></html>"

# Parse the code
tree = parser.parse(html_code)

# Get the root node of the syntax tree
root_node = tree.root_node

# Print an S-expression representation of the tree
# (str(node) replaces Node.sexp(), which recent py-tree-sitter releases removed)
print(str(root_node))
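
The S-expression is printed on a single line, which is hard to read for larger documents. As a minimal sketch, the helpers below walk a tree and recover the source text behind a node. The names `walk`, `outline`, and `node_text` are hypothetical and not part of either library; they rely only on the `type`, `named_children`, `start_byte`, and `end_byte` attributes of py-tree-sitter nodes.

```python
from typing import Iterator, Tuple


def walk(node, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, node type) pairs in pre-order (hypothetical helper)."""
    yield depth, node.type
    # named_children skips anonymous tokens such as "<" and ">"
    for child in node.named_children:
        yield from walk(child, depth + 1)


def outline(root) -> str:
    """Render the tree as an indented outline, one node type per line."""
    return "\n".join("  " * depth + kind for depth, kind in walk(root))


def node_text(source: bytes, node) -> str:
    """Return the source text covered by a node.

    start_byte and end_byte are byte offsets, so the slice is taken from
    the original bytes object rather than from a decoded string.
    """
    return source[node.start_byte:node.end_byte].decode("utf-8")
```

With the quickstart's variables, `print(outline(root_node))` prints one node type per line, and `node_text(html_code, some_node)` returns the HTML fragment that `some_node` spans.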
