Tree-sitter Markdown Grammar

0.5.3 · active · verified Sun Apr 12

tree-sitter-markdown provides a Markdown grammar for the Tree-sitter parsing library. It parses Markdown content into a syntax tree that tools can use for syntax highlighting, refactoring, and static analysis. The library is actively maintained, with frequent minor releases for grammar improvements and compatibility updates.

Quickstart

This quickstart demonstrates how to initialize the Tree-sitter Markdown parser, parse a sample Markdown string, and inspect the resulting syntax tree using S-expressions and node access.

from tree_sitter import Language, Parser
from tree_sitter_markdown import language

# Load the Tree-sitter Markdown language
# `language()` returns a raw handle to the pre-compiled grammar; wrap it in
# `Language` before use (py-tree-sitter 0.22 and later).
MARKDOWN_LANGUAGE = Language(language())

# Initialize a parser for that language
# (`Parser.set_language` was deprecated in py-tree-sitter 0.22 and later removed)
parser = Parser(MARKDOWN_LANGUAGE)

# Parse a Markdown string
markdown_text = "## Hello Tree-sitter\n\nThis is a *paragraph* with **bold** text."
tree = parser.parse(bytes(markdown_text, "utf8"))

# Print the S-expression representation of the syntax tree
print("Parsed S-expression:")
print(str(tree.root_node))  # `Node.sexp()` is deprecated in recent py-tree-sitter releases

# Example: Accessing a specific node
root_node = tree.root_node
# The first child is the first top-level block. Depending on the grammar
# version, it may be a `section` that wraps the heading rather than the heading itself.
if root_node.child_count > 0:
    first_node = root_node.children[0]
    print(f"\nFirst node type: {first_node.type}, text: {first_node.text.decode('utf8')}")
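
To see more than the first child, the node access above can be extended into a full traversal. The following is a minimal sketch, not part of the package: `walk` and `StubNode` are hypothetical names introduced here, and `StubNode` is only a stand-in exposing the `type` and `children` attributes so the example runs without the grammar installed. The node types in the sample tree are illustrative and may differ from what your grammar version produces.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class StubNode:
    """Hypothetical stand-in exposing the two attributes the walker reads."""
    type: str
    children: List["StubNode"] = field(default_factory=list)


def walk(node, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, node type) pairs in depth-first pre-order."""
    yield depth, node.type
    for child in node.children:
        yield from walk(child, depth + 1)


# Hand-built tree for illustration only; real node types come from the grammar.
sample = StubNode("document", [
    StubNode("section", [
        StubNode("atx_heading"),
        StubNode("paragraph", [StubNode("inline")]),
    ]),
])

outline = list(walk(sample))
for depth, node_type in outline:
    # Indent each node type by its depth to show the tree shape
    print("  " * depth + node_type)
```

Because `walk` only reads `type` and `children`, it should also accept `tree.root_node` from the quickstart in place of `sample`.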
