Tree-sitter Bash Grammar
tree-sitter-bash provides the Bash grammar for the Tree-sitter parsing library. It enables robust, error-tolerant parsing of Bash scripts and other shell code, which supports syntax highlighting, code navigation, and refactoring tools. The current version is 0.25.1; releases are regular and often align with updates to the core Tree-sitter project.
Warnings
- gotcha `tree-sitter-bash` is used together with the `tree-sitter` Python bindings, and both packages contain C extensions. Pre-built wheels are published for common platforms, but when pip has to build from source, a C compiler (e.g., `gcc` or `clang`) and the Python development headers must be present on the system. If they are missing, `pip install` can fail during the build. Installing `tree-sitter` explicitly alongside the grammar avoids relying on pip to pull the bindings in.
- gotcha The `parser.parse()` method strictly expects input as `bytes`, not a Python `str`. Passing a `str` will result in a `TypeError`.
- breaking Minor updates to the underlying `tree-sitter` library or changes in the `tree-sitter-bash` grammar definition (even across patch versions) can subtly alter the generated syntax tree structure. This includes changes to node types, field names, or the presence/absence of anonymous nodes. Code relying on specific tree traversals or node names might break.
- gotcha This package (`tree-sitter-bash`) provides *only* the Bash grammar. To parse other programming languages, you will need to install separate `tree-sitter-*` grammar packages (e.g., `tree-sitter-python`) or utilize libraries like `tree-sitter-languages` which bundle many pre-compiled grammars.
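The `bytes` requirement noted above also affects how node positions are read: `start_byte` and `end_byte` are byte offsets, so they should index the encoded source rather than the original `str`. A minimal sketch of the idea, using only the standard library; `to_source_bytes` and `slice_text` are hypothetical helper names, not part of `tree-sitter`:

```python
def to_source_bytes(source):
    """Return `source` as UTF-8 bytes, accepting either str or bytes."""
    if isinstance(source, bytes):
        return source
    if isinstance(source, str):
        return source.encode("utf-8")
    raise TypeError(f"expected str or bytes, got {type(source).__name__}")


def slice_text(source_bytes, start_byte, end_byte):
    """Decode the text covered by a pair of byte offsets."""
    return source_bytes[start_byte:end_byte].decode("utf-8")


script = 'echo "héllo"\n'
data = to_source_bytes(script)

# The non-ASCII character takes two bytes, so byte and character
# offsets diverge after it.
print(len(script), len(data))

# Slice the bytes, not the str, when using byte offsets.
print(slice_text(data, 5, 13))
```

With a real parser, the same pattern would be `tree = parser.parse(to_source_bytes(script))` followed by `slice_text(data, node.start_byte, node.end_byte)`.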
Install
pip install tree-sitter tree-sitter-bash
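Because the syntax tree's shape can shift between releases, it may help to record which versions are actually installed before debugging a traversal. A small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption for illustration:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report the bindings and the grammar side by side.
report = {
    name: installed_version(name)
    for name in ("tree-sitter", "tree-sitter-bash")
}
for name, found in report.items():
    print(f"{name}: {found or 'not installed'}")
```

Pinning both packages to the reported versions in a requirements file is one way to keep node types and field names stable across environments.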
Imports
- language
from tree_sitter_bash import language
Quickstart
from tree_sitter import Language, Parser
from tree_sitter_bash import language

# Wrap the grammar in a Language object and pass it to the parser.
# (Recent py-tree-sitter releases no longer provide Parser.set_language().)
BASH_LANGUAGE = Language(language())
parser = Parser(BASH_LANGUAGE)

# Bash code to parse
bash_code = '''
#!/bin/bash
echo "Hello, Tree-sitter!"
# Loop example
for i in $(seq 1 3); do
    echo "Count: $i"
done
'''

# Parse the code (input must be bytes)
tree = parser.parse(bash_code.encode("utf8"))

# Print the S-expression representation of the syntax tree
# (str(node) replaces the older node.sexp() method)
print("\n--- S-expression Tree ---")
print(str(tree.root_node))

# Example: find all nodes of a given type
def find_nodes_by_type(node, node_type):
    nodes = []
    if node.type == node_type:
        nodes.append(node)
    for child in node.children:
        nodes.extend(find_nodes_by_type(child, node_type))
    return nodes

command_nodes = find_nodes_by_type(tree.root_node, 'command')
print(f"\nFound {len(command_nodes)} command nodes in the script.")
# Each echo invocation is a command node; the `seq 1 3` inside $(...)
# is normally reported as a command node as well.
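The recursive helper above is fine for typical scripts, but very deeply nested input could approach Python's recursion limit. The following is a sketch of an iterative variant under that assumption. `StubNode` is a hypothetical stand-in that only mimics the `type` and `children` attributes, so the example runs without the grammar installed; real `tree-sitter` nodes can be passed in the same way.

```python
from dataclasses import dataclass, field


@dataclass
class StubNode:
    """Stand-in exposing the two attributes the traversal relies on."""
    type: str
    children: list = field(default_factory=list)


def iter_nodes_by_type(root, node_type):
    """Yield matching nodes in pre-order using an explicit stack."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.type == node_type:
            yield node
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))


# Hand-built tree loosely shaped like the Quickstart script.
stub_tree = StubNode("program", [
    StubNode("command"),
    StubNode("for_statement", [
        StubNode("command_substitution", [StubNode("command")]),
        StubNode("do_group", [StubNode("command")]),
    ]),
])

matches = list(iter_nodes_by_type(stub_tree, "command"))
print(f"Found {len(matches)} command nodes in the stub tree.")
```

Because the function is a generator, callers can stop at the first match without walking the rest of the tree.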