Tree-sitter Embedded Template Grammar
tree-sitter-embedded-template provides a Tree-sitter parser for templating languages such as ERB (Embedded Ruby) and EJS (Embedded JavaScript). It parses files in which scripting code is interleaved with text content and set off by delimiters like `<%` and `%>`. The library is currently at version 0.25.0 and is actively maintained as part of the broader Tree-sitter ecosystem.
Warnings
- breaking The `Language.build_library` function for compiling grammars from source was deprecated in `py-tree-sitter` 0.21 and removed in 0.22. Grammars are now primarily consumed as pre-compiled packages installed with `pip install`.
- gotcha Grammar updates, particularly minor and major versions, can introduce changes to the Abstract Syntax Tree (AST) node types or structure. This can break existing Tree-sitter queries that rely on specific node names or hierarchies.
- breaking The behavior of `iter_matches` in the core `tree-sitter` library was corrected, which was a breaking change for code relying on its previously incorrect output. An opt-out flag for the old behavior was provided but has since been removed.
- gotcha The `tree-sitter` ecosystem shifted from using `package.json` to `tree-sitter.json` for grammar metadata around October 2024. While this primarily affects grammar maintainers and build systems, it can cause issues if you're attempting to build or integrate the grammar from source using older tooling.
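One way to catch the node-type changes described above early is a small smoke test that confirms every node type your code or queries depend on still appears in a sample parse. The sketch below is an illustration, not part of the library: `missing_node_types` and `StubNode` are hypothetical names, and `StubNode` exists only so the example runs without the grammar installed. Real `tree_sitter` nodes expose the same `type` and `children` attributes, so the helper should accept `tree.root_node` unchanged. The hand-built tree is loosely modeled on a parse of a short template and is an assumption, as is the name `old_directive_name`.

```python
from dataclasses import dataclass, field


@dataclass
class StubNode:
    """Hypothetical stand-in for a tree_sitter Node (only .type and .children)."""
    type: str
    children: list = field(default_factory=list)


def missing_node_types(root, required):
    """Return the sorted node types in `required` that never occur under `root`."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        seen.add(node.type)
        stack.extend(node.children)
    return sorted(set(required) - seen)


# Hand-built stand-in tree; with the grammar installed, pass tree.root_node instead.
sample = StubNode("template", [
    StubNode("content"),
    StubNode("output_directive", [StubNode("<%="), StubNode("code"), StubNode("%>")]),
    StubNode("content"),
])

# 'old_directive_name' is a made-up type that the sample never contains.
print(missing_node_types(sample, {"content", "output_directive", "old_directive_name"}))
# prints ['old_directive_name']
```

Running a check like this in CI after each grammar upgrade turns a silent query mismatch into an explicit failure.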
Install
- pip install tree-sitter tree-sitter-embedded-template
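Because several of the warnings above depend on which versions are installed, it can help to confirm what actually landed in the environment. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the two distribution names are taken from the `pip install` command.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report the version of each distribution named in the install command.
for name in ("tree-sitter", "tree-sitter-embedded-template"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```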
Imports
- Language
from tree_sitter import Language
- Parser
from tree_sitter import Parser
- language
import tree_sitter_embedded_template as ts_embedded_template ... ts_embedded_template.language()
Quickstart
from tree_sitter import Language, Parser
import tree_sitter_embedded_template as ts_embedded_template
# Example ERB content
CODE = b"""
<h1>Hello, <%= name %>!</h1>
<% if items.any? %>
<ul>
<% items.each do |item| %>
<li><%= item %></li>
<% end %>
</ul>
<% else %>
<p>No items to display.</p>
<% end %>
"""
# Installing the pre-compiled grammar package is all that is needed here.
# Language.build_library, once used to compile grammars from source, has been removed from py-tree-sitter.
# Load the embedded template language
EMBEDDED_TEMPLATE_LANGUAGE = Language(ts_embedded_template.language())
# Create a parser for that language
# (pass the language to the constructor; Parser.set_language is not available in current py-tree-sitter releases)
parser = Parser(EMBEDDED_TEMPLATE_LANGUAGE)
# Parse the code
tree = parser.parse(CODE)
# Get the root node of the syntax tree
root_node = tree.root_node
# Print basic information about the root node
print(f"Root Node Type: {root_node.type}")
print(f"Root Node Text: {root_node.text.decode('utf8')}")
print(f"Number of children: {len(root_node.children)}")
# Example: find all directive nodes
# Note: This is a basic traversal. For complex queries, use Tree-sitter's query API.
# The root 'template' node holds 'content' and directive nodes as direct children.
for child in root_node.children:
    if child.type == 'output_directive':  # <%= ... %>
        print(f"Found output directive: {child.text.decode('utf8')} (Type: {child.type})")
    elif child.type == 'directive':  # <% ... %>
        print(f"Found directive: {child.text.decode('utf8')} (Type: {child.type})")
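When you need more than one level of the tree, a recursive walk is often easier to maintain than nested loops. The following is a sketch under stated assumptions rather than a library feature: `iter_nodes`, `outline`, and the stand-in `StubNode` class are hypothetical names, and the hand-built tree only approximates what the parser returns. The helpers rely solely on the `type` and `children` attributes, which real `tree_sitter` nodes also provide, so `outline(tree.root_node)` should work with the parsed tree from the quickstart.

```python
from dataclasses import dataclass, field


@dataclass
class StubNode:
    """Hypothetical stand-in for a tree_sitter Node (only .type and .children)."""
    type: str
    children: list = field(default_factory=list)


def iter_nodes(node, depth=0):
    """Yield (depth, node) pairs in pre-order, i.e. document order."""
    yield depth, node
    for child in node.children:
        yield from iter_nodes(child, depth + 1)


def outline(root):
    """Render one node type per line, indented two spaces per tree level."""
    return "\n".join("  " * depth + node.type for depth, node in iter_nodes(root))


# Hand-built stand-in tree; with the grammar installed, pass tree.root_node instead.
sample = StubNode("template", [
    StubNode("content"),
    StubNode("directive", [StubNode("code")]),
])

print(outline(sample))
# prints:
# template
#   content
#   directive
#     code
```

An outline like this is also a quick way to discover the node types a given grammar version actually emits before writing queries against them.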