tree-sitter-c C Grammar
tree-sitter-c is a Python package that provides the C language grammar for the Tree-sitter parsing library. It lets Python applications parse C source into concrete syntax trees for analysis, tooling, and transformation. The package is actively maintained, with frequent minor releases that track the `tree-sitter` core and the C language specification.
Warnings
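- breaking The `Language.build_library` method, previously used to compile grammars from source, is no longer available in the `tree-sitter` Python bindings (deprecated in 0.21 and removed in 0.22). Load the precompiled grammar from the `tree-sitter-c` package with `Language(tsc.language())` instead.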
- gotcha A mismatch between the installed `tree-sitter` core library version and the `tree-sitter-c` grammar package version can cause runtime errors when the language is loaded (e.g., 'version-mismatch'). Upgrading both packages together usually resolves it.
- gotcha The `tree-sitter` parser expects source code as a `bytes` object, typically UTF-8 encoded. Encode `str` input before parsing, for example with `source.encode("utf-8")`.
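The two gotchas above can be checked before any parsing happens. The following is a minimal sketch using only the standard library; the helper name `installed_version` and the sample C snippet are illustrative and not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report both versions so a mismatch is easy to spot before parsing.
for name in ("tree-sitter", "tree-sitter-c"):
    print(f"{name}: {installed_version(name) or 'not installed'}")

# The parser takes bytes, so encode str input explicitly.
source_text = 'const char *s = "héllo";'
source_bytes = source_text.encode("utf-8")

# Byte length and character length differ for non-ASCII input:
# "é" is one character but two bytes in UTF-8.
print(f"characters: {len(source_text)}, bytes: {len(source_bytes)}")
```

Because the parser works on bytes, node offsets such as `start_byte` and `end_byte` index into the encoded buffer, so slice `source_bytes` rather than `source_text` when extracting node text.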
Install
pip install tree-sitter tree-sitter-c
Imports
- Language
from tree_sitter import Language, Parser
- language
import tree_sitter_c as tsc
C_LANGUAGE = Language(tsc.language())
Quickstart
import tree_sitter_c as tsc
from tree_sitter import Language, Parser
# Load the C language grammar
C_LANGUAGE = Language(tsc.language())
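# Create a parser for the C grammar.
# tree-sitter 0.22+ takes the language in the constructor;
# the older parser.set_language() method is no longer available there.
parser = Parser(C_LANGUAGE)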
# C code to parse
c_code = b"""
int main() {
    printf("Hello, Tree-sitter C!");
    return 0;
}
"""
# Parse the code
tree = parser.parse(c_code)
# Get the root node and print its type
root_node = tree.root_node
print(f"Root node type: {root_node.type}")
print(f"Root node text: {root_node.text.decode('utf8')}")
# Example of traversing a child node
if root_node.children:
    first_child = root_node.children[0]
    print(f"First child type: {first_child.type}")
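The Quickstart only inspects the first child of the root. To visit the whole tree, a small recursive walk is usually enough. The sketch below is illustrative: `iter_nodes` and `demo` are hypothetical helper names, not part of `tree-sitter`, and `demo` assumes tree-sitter 0.22 or newer, where `Parser` accepts the language as a constructor argument. The helper relies only on the `type` and `children` attributes of a node.

```python
from typing import Any, Iterator


def iter_nodes(node: Any) -> Iterator[Any]:
    """Yield `node` and all of its descendants in depth-first pre-order."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def demo() -> None:
    """Parse a small C function and report every function definition."""
    # Imported here so iter_nodes stays importable without the packages.
    import tree_sitter_c as tsc
    from tree_sitter import Language, Parser

    parser = Parser(Language(tsc.language()))
    tree = parser.parse(b"int add(int a, int b) { return a + b; }")

    for node in iter_nodes(tree.root_node):
        if node.type == "function_definition":
            # start_point is a (row, column) pair, both zero-based.
            row, column = node.start_point
            print(f"function_definition at line {row + 1}, column {column}")
```

Calling `demo()` runs the walk on a one-line function. For very deeply nested sources, recursion depth can become a concern; the cursor returned by `tree.walk()` is an alternative that avoids recursion.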