JSON grammar for Tree-sitter
tree-sitter-json provides the JSON language grammar for the `tree-sitter` parsing library. It enables parsing JSON source code into a concrete syntax tree, allowing for efficient structural analysis, manipulation, and syntax highlighting. The library is currently at version 0.24.8 and typically releases new versions in sync with the upstream `tree-sitter` core, which has a relatively active release cadence.
Warnings
- breaking The `tree-sitter` core library (which `tree-sitter-json` relies on) had an internal ABI bump to version 15 in v0.25.0 (February 2025). Ensure your installed `tree-sitter` Python package is compatible with the ABI version used to compile `tree-sitter-json`. Mismatched versions can lead to crashes or unexpected behavior.
- gotcha The `parser.parse()` method expects input as `bytes`, not a standard Python `str`. Providing a string will result in a `TypeError`.
- gotcha The `tree-sitter-json` package only provides the grammar. The actual parsing engine and Python bindings are provided by the `tree-sitter` package. Forgetting to install `tree-sitter` will prevent `tree-sitter-json` from being used.
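The bytes requirement above has a second consequence: positions such as `node.start_byte` and `node.end_byte` index into the encoded bytes, not into the original `str`. The following is a minimal, standard-library-only sketch of that conversion. The helper names `to_source_bytes` and `slice_by_byte_range` are hypothetical and belong to neither package.

```python
def to_source_bytes(source):
    """Return UTF-8 bytes for a str; pass bytes-like input through."""
    if isinstance(source, str):
        return source.encode("utf8")
    if isinstance(source, (bytes, bytearray)):
        return bytes(source)
    raise TypeError(f"expected str or bytes, got {type(source).__name__}")


def slice_by_byte_range(source_bytes, start_byte, end_byte):
    """Decode the text covered by a byte range, e.g. a node's start/end bytes."""
    return source_bytes[start_byte:end_byte].decode("utf8")


json_text = '{"city": "Zürich"}'
source_bytes = to_source_bytes(json_text)

# "ü" takes two bytes in UTF-8, so the two lengths differ.
print(len(json_text), len(source_bytes))  # 18 19

# Byte range 9..18 covers the string literal, quotes included.
print(slice_by_byte_range(source_bytes, 9, 18))  # "Zürich"
```

Passing `source_bytes` rather than `json_text` to `parser.parse()` avoids the `TypeError`, and slicing the same bytes object keeps offsets consistent for non-ASCII input.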
Install
-
pip install tree-sitter tree-sitter-json
Imports
- Language
from tree_sitter import Language, Parser
- tree_sitter_json.language
import tree_sitter_json
JSON_LANGUAGE = Language(tree_sitter_json.language())
Quickstart
import tree_sitter_json
from tree_sitter import Language, Parser
# Load the JSON language grammar
JSON_LANGUAGE = Language(tree_sitter_json.language())
# Create a parser and set its language
parser = Parser(JSON_LANGUAGE)
# Example JSON source code (must be bytes)
json_code = b'''
{
  "name": "Alice",
  "age": 30,
  "isStudent": false,
  "courses": ["Math", "Science"]
}
'''
# Parse the code
tree = parser.parse(json_code)
# Get the root node and print its type and text
root_node = tree.root_node
print(f"Root node type: {root_node.type}")
print(f"Root node text: {root_node.text.decode('utf8')}")
# Example: iterate over the pairs of the top-level object.
# The root node is a `document`; the JSON value is its first named child.
top_level = root_node.named_children[0] if root_node.named_children else None
if top_level is not None and top_level.type == 'object':
    for child in top_level.named_children:
        if child.type == 'pair':
            key_node = child.child_by_field_name('key')
            value_node = child.child_by_field_name('value')
            if key_node is not None and value_node is not None:
                print(f"  Key: {key_node.text.decode('utf8')}, Value: {value_node.text.decode('utf8')}")
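The pair iteration above handles only one level. A recursive walk can turn the whole tree into Python values. The sketch below assumes the grammar's node types (`document`, `object`, `pair`, `array`, `comment`, and scalar literals) and the `key`/`value` fields on `pair`. It relies only on the node attributes `type`, `text`, `named_children`, and `child_by_field_name`, so it needs no import from `tree_sitter` itself. `node_to_python` is a hypothetical helper name, and the sketch does not handle `ERROR` nodes from malformed input.

```python
import json


def node_to_python(node):
    """Convert a JSON syntax node into the Python value it denotes."""
    kind = node.type
    if kind == "document":
        # The document wraps the top-level value(s); use the first one.
        values = [c for c in node.named_children if c.type != "comment"]
        return node_to_python(values[0]) if values else None
    if kind == "object":
        result = {}
        for child in node.named_children:
            if child.type != "pair":
                continue
            key = child.child_by_field_name("key")
            value = child.child_by_field_name("value")
            if key is None or value is None:
                continue
            # The key text includes its quotes; json.loads unescapes it.
            result[json.loads(key.text)] = node_to_python(value)
        return result
    if kind == "array":
        return [node_to_python(c) for c in node.named_children
                if c.type != "comment"]
    # Scalars (string, number, true, false, null): let the standard
    # library interpret the literal text, which is bytes.
    return json.loads(node.text)
```

With the quickstart's `tree`, the call would be `node_to_python(tree.root_node)`. Delegating scalars to `json.loads` keeps escape handling and number parsing out of the walker, at the cost of raising on any node whose text is not a valid JSON literal.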