Tree-sitter Language Pack

1.5.0 · active · verified Thu Apr 09

tree-sitter-language-pack is a Python library providing pre-compiled Tree-sitter parsers for over 300 programming languages. It offers a unified `process()` API for parsing, structural code analysis, and AST-aware code chunking. Releases are frequent and typically add new language grammars and API enhancements.

Warnings

Install

Imports

Quickstart

This quickstart uses the core `process()` API to extract structured information from source code: code structure (such as functions and classes), diagnostics, and comments. It also shows AST-aware chunking, which splits larger sources into chunks along syntax boundaries. The library downloads the required language parsers automatically on first use.

from tree_sitter_language_pack import process, ProcessConfig

source_code = """
def hello():
    # This is a comment
    print("Hello, World!")

class MyClass:
    def __init__(self):
        pass
"""

# Process source code for intelligence extraction (auto-downloads language if needed)
result = process(source_code, ProcessConfig(language="python"))

# 'structure' holds the extracted structural items for the source
print(f"Structure items found: {len(result.get('structure', []))}")
print(f"Diagnostics: {result.get('diagnostics', [])}")
print(f"Comments: {result.get('comments', [])}")

# Example with AST-aware chunking
chunked_result = process(
    source_code,
    ProcessConfig(language="python", chunk_max_size=50, comments=True)
)
print(f"Chunks found: {len(chunked_result.get('chunks', []))}")
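
To make the idea of AST-aware chunking concrete, here is a minimal sketch that uses only Python's standard `ast` module and does not call this library. It is not how tree-sitter-language-pack implements chunking; `chunk_by_top_level_nodes` and its `max_chars` parameter are hypothetical names chosen for illustration. The sketch groups top-level statements into chunks up to a size limit and never cuts through the middle of a definition. It assumes Python 3.9 or later.

```python
import ast


def chunk_by_top_level_nodes(source: str, max_chars: int) -> list[str]:
    """Group consecutive top-level statements into chunks of roughly max_chars.

    Hypothetical helper for illustration only. A single statement longer than
    max_chars becomes its own oversized chunk rather than being split.
    Comments, blank lines, and decorator lines between statements are dropped,
    because only the line span reported for each AST node is copied.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for node in tree.body:
        # lineno and end_lineno are 1-based and inclusive.
        segment = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        # Start a new chunk when adding this statement would exceed the limit.
        if current and current_len + len(segment) > max_chars:
            chunks.append("\n".join(current))
            current, current_len = [], 0
        current.append(segment)
        current_len += len(segment)
    if current:
        chunks.append("\n".join(current))
    return chunks


sample = '''def hello():
    print("Hello, World!")

class MyClass:
    def __init__(self):
        pass
'''

chunks = chunk_by_top_level_nodes(sample, max_chars=50)
for i, chunk in enumerate(chunks):
    print(f"--- chunk {i} ---")
    print(chunk)
```

Because the split points are statement boundaries, each chunk is itself syntactically complete, which a fixed-width character split cannot guarantee.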
