TOML Grammar for Tree-sitter in Python
tree-sitter-toml provides a TOML grammar for the Tree-sitter parsing library, with Python bindings and pre-compiled wheels. It enables efficient, incremental parsing of TOML files, generating concrete syntax trees for various programming tools. The library is currently at version 0.7.0 and is actively maintained within the Tree-sitter grammars ecosystem.
Warnings
- breaking The underlying `tree-sitter` core library (which `py-tree-sitter` binds to) may introduce breaking changes, particularly with its Application Binary Interface (ABI). A major ABI version bump in `tree-sitter` can render `tree-sitter-toml` incompatible until it's updated and recompiled for the new ABI, potentially leading to `Language` loading failures.
- gotcha Changes to the `tree-sitter-toml` grammar definition, even minor ones, can alter the structure of the generated syntax tree. Existing Tree-sitter queries (used for pattern matching or navigation) written against an older grammar structure might become invalid or return incorrect results.
- gotcha The `parser.parse()` method of `tree-sitter` expects the source as a `bytes` object, UTF-8 encoded by default. Passing a Python `str` raises a `TypeError`, and bytes in a different encoding may parse incorrectly or produce error nodes. Encode text first, for example with `text.encode("utf-8")`.
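The bytes requirement in the last warning can be guarded against with a small normalizing step before parsing. The following is a minimal sketch using only the standard library; `to_source_bytes` is a hypothetical helper name, not part of `tree-sitter` or `tree-sitter-toml`, and it assumes UTF-8 unless told otherwise.

```python
def to_source_bytes(source, encoding="utf-8"):
    """Return `source` as bytes suitable for handing to a parser.

    Hypothetical helper for illustration; not part of tree-sitter.
    """
    if isinstance(source, str):
        # Text must be encoded explicitly before parsing.
        return source.encode(encoding)
    if isinstance(source, (bytes, bytearray, memoryview)):
        # Already bytes-like: return an immutable copy.
        return bytes(source)
    raise TypeError(f"expected str or bytes-like, got {type(source).__name__}")


toml_text = 'name = "my-app"\n'
source = to_source_bytes(toml_text)
print(type(source).__name__, len(source))
```

The result can then be passed to `parser.parse(source)` in place of a raw string.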
Install
pip install tree-sitter-toml tree-sitter
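Because compatibility between the grammar package and the core bindings depends on the versions installed, it can help to confirm what `pip` actually resolved. The sketch below uses the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper name, and the distribution names are assumed to match the ones in the install command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("tree-sitter", "tree-sitter-toml")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Recording these two versions alongside any bug report makes ABI mismatches easier to diagnose.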
Imports
- language
import tree_sitter_toml
from tree_sitter import Language, Parser
TOML_LANGUAGE = Language(tree_sitter_toml.language())
Quickstart
import tree_sitter_toml
from tree_sitter import Language, Parser
# Load the TOML grammar
TOML_LANGUAGE = Language(tree_sitter_toml.language())
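# Create a parser for the TOML grammar. Recent py-tree-sitter releases take
# the language in the constructor; Parser.set_language() has been removed.
parser = Parser(TOML_LANGUAGE)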
# Sample TOML source code
toml_code = b'''
[package]
name = "my-app"
version = "0.1.0"
authors = ["John Doe <john@example.com>"]
'''
# Parse the code
tree = parser.parse(toml_code)
# Get the root node and print its type
root_node = tree.root_node
print(f"Root Node Type: {root_node.type}")
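# Example: find the 'name' key-value pair.
# Assumes the grammar's node names: a key-value line is a `pair` node whose
# first named child is the key and second is the value (no field names).
def find_name(node):
    for child in node.children:
        if child.type == 'pair' and child.named_child_count >= 2:
            key, value = child.named_children[0], child.named_children[1]
            if key.text == b'name':
                print(f"Found name: {value.text.decode()}")
        # Recurse so pairs nested inside tables are found too
        find_name(child)

find_name(root_node)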
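When the node type names of the installed grammar are uncertain, a quick inventory of the tree is a useful first step before writing a search like the one above. The following is a minimal sketch that relies only on the `type` and `children` attributes of each node; `iter_nodes` and `count_node_types` are hypothetical helper names, not part of `tree-sitter`.

```python
def iter_nodes(node):
    """Yield `node` and all of its descendants in depth-first pre-order."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(current.children))


def count_node_types(root):
    """Return a dict mapping each node type name to how often it occurs."""
    counts = {}
    for node in iter_nodes(root):
        counts[node.type] = counts.get(node.type, 0) + 1
    return counts
```

With the Quickstart's `tree` in scope, `count_node_types(tree.root_node)` lists the type names the grammar actually produced for the sample input, which can then be used in type checks or queries. The iterative traversal avoids Python's recursion limit on deeply nested documents.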