Python Bindings for libclang
The `clang` package provides Python bindings for libclang, Clang's C interface. The bindings let you parse C, C++, and Objective-C source code, traverse the resulting abstract syntax trees (ASTs), and access compiler information such as diagnostics. The package is maintained as part of the broader LLVM project and typically releases in sync with major LLVM versions. The current version is 21.1.7.
Warnings
- gotcha The `clang` PyPI package provides the Python bindings ONLY. It does NOT include the native `libclang` shared library, which the bindings load at runtime. You must install `libclang` separately on your operating system.
- gotcha After installing `libclang`, the Python bindings might not find it automatically. You may need to point them explicitly at `libclang.so` (Linux), `libclang.dylib` (macOS), or `libclang.dll` (Windows), using `Config.set_library_path()` (directory) or `Config.set_library_file()` (full file path) before the library is first used.
- gotcha The version of the `clang` Python package should match the version of the `libclang` installed on the system. Mismatches can lead to crashes, incorrect parsing, or unexpected behavior.
- breaking The `clang.cindex` API is a thin wrapper over libclang's C interface and can have subtle breaking changes between major LLVM versions (e.g., how AST nodes are iterated, new cursor kinds, or changes in diagnostic information structure).
- gotcha Correctly configuring compiler arguments (e.g., include paths, language standard, preprocessor definitions) is crucial for accurate parsing of C/C++ source code, especially for complex projects.
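As a rough illustration of the last warning, the sketch below assembles the `args` list passed to `Index.parse()` from a language, a standard, include directories, and preprocessor definitions. `build_clang_args` is a hypothetical helper, not part of `clang.cindex`; it only formats the standard Clang flags `-x`, `-std=`, `-I`, and `-D`, and the paths and macro names in the comment are placeholders.

```python
def build_clang_args(language="c", std=None, include_dirs=(), defines=None):
    """Build a compiler-argument list for Index.parse() (hypothetical helper)."""
    args = ["-x", language]          # force the source language
    if std:
        args.append(f"-std={std}")   # language standard, e.g. c99 or c++17
    # One -I flag per include directory
    args.extend(f"-I{directory}" for directory in include_dirs)
    # -DNAME for flag-style macros, -DNAME=VALUE otherwise
    for name, value in (defines or {}).items():
        args.append(f"-D{name}" if value is None else f"-D{name}={value}")
    return args


# Intended use (requires libclang; the file name and paths are placeholders):
#   tu = index.parse("src/widget.cpp",
#                    args=build_clang_args("c++", "c++17", ["include"], {"NDEBUG": None}))
```

Keeping the flags in one place makes it easier to mirror what the real build passes to the compiler, which is usually what determines whether headers resolve during parsing.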
Install
-
pip install clang
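Because the pip package does not ship the shared library, a small lookup step after installation can help. The following is a minimal sketch, assuming libclang lives in one of a few candidate directories; `libclang_filename`, `find_libclang`, `configure_libclang`, and the directory list are hypothetical names and example paths, while `Config.set_library_file()` is the real binding call.

```python
import os
import platform


def libclang_filename(system=None):
    """Return the libclang file name for a platform.system() value."""
    system = system or platform.system()
    if system == "Darwin":
        return "libclang.dylib"
    if system == "Windows":
        return "libclang.dll"
    return "libclang.so"


def find_libclang(candidate_dirs, system=None):
    """Return the first existing libclang path in candidate_dirs, or None."""
    name = libclang_filename(system)
    for directory in candidate_dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None


# Example locations only; adjust to your installation.
CANDIDATE_DIRS = [
    "/usr/lib/llvm-18/lib",        # Ubuntu/Debian LLVM 18
    "/usr/local/opt/llvm/lib",     # Homebrew on Intel macOS
    "/opt/homebrew/opt/llvm/lib",  # Homebrew on Apple Silicon
]


def configure_libclang(candidate_dirs=CANDIDATE_DIRS):
    """Point the bindings at libclang; call before the first Index.create()."""
    from clang.cindex import Config  # imported lazily so the lookup works without it
    found = find_libclang(candidate_dirs)
    if found is None:
        raise FileNotFoundError("libclang not found in candidate directories")
    Config.set_library_file(found)
    return found
```

Calling `configure_libclang()` once at program start, before any `Index.create()`, is the intended use; some distributions install only versioned file names, in which case the lookup would need extending.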
Imports
- Index
from clang.cindex import Index
- Config
from clang.cindex import Config
- Cursor
from clang.cindex import Cursor
- TranslationUnit
from clang.cindex import TranslationUnit
Quickstart
import os
from clang.cindex import Config, CursorKind, Diagnostic, Index  # type: ignore

# CRITICAL: Ensure libclang (the native shared library) is installed and discoverable.
# The 'clang' pip package only provides Python wrappers.
# Configure the library location before the first Index.create() call.
#
# On macOS (with Homebrew LLVM; Apple Silicon uses /opt/homebrew instead of /usr/local):
# Config.set_library_path('/usr/local/opt/llvm/lib')
# On Linux (e.g., Ubuntu/Debian LLVM-18):
# Config.set_library_path('/usr/lib/llvm-18/lib')
# On Windows, ensure 'libclang.dll' is in your PATH or pass its full path
# to Config.set_library_file().
# Alternatively, set the CLANG_LIBRARY_PATH environment variable; this script
# reads it explicitly below.
if os.environ.get('CLANG_LIBRARY_PATH'):
    Config.set_library_path(os.environ['CLANG_LIBRARY_PATH'])

source_code = """
#include <stdio.h>
int add(int a, int b) {
    return a + b;
}
int main() {
    printf("Hello from Clang AST!");
    int result = add(5, 7);
    return 0;
}
"""

try:
    index = Index.create()
    # Parse the source code from a string. 'main.c' is a dummy name.
    # args can include compiler flags, e.g., ['-std=c99', '-I/path/to/includes']
    tu = index.parse('main.c', unsaved_files=[('main.c', source_code)], args=['-x', 'c'])

    # Check for diagnostics (errors, warnings) during parsing
    for diag in tu.diagnostics:
        if diag.severity >= Diagnostic.Error:  # Error or Fatal
            print(f"Diagnostic (Error): {diag.spelling} at {diag.location}")

    # Walk the AST and print top-level declarations from main.c only
    # (declarations pulled in from included headers are skipped)
    print("\nTop-level declarations:")
    for cursor in tu.cursor.get_children():
        if cursor.location.file and cursor.location.file.name == 'main.c':
            print(f"- {cursor.kind.name}: {cursor.spelling} at Line {cursor.location.line}")

    # Example: Finding the 'main' function and its call to 'add'
    for cursor in tu.cursor.walk_preorder():
        if cursor.kind == CursorKind.FUNCTION_DECL and cursor.spelling == 'main':
            print(f"\nFound 'main' function at Line {cursor.location.line}")
            # The call is nested inside the function body, so walk the whole
            # subtree rather than only the direct children.
            for node in cursor.walk_preorder():
                if node.kind == CursorKind.CALL_EXPR and node.spelling == 'add':
                    print(f" -> 'main' calls 'add' at Line {node.location.line}")
            break
except Exception as e:
    print(f"An error occurred: {e}")
    print("Hint: Make sure libclang is installed on your system and its path is correctly configured.")
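Call expressions usually sit several levels below a function's declaration cursor (inside the body and any enclosing statements), so a preorder walk of each function's subtree is a dependable way to find them. The sketch below generalizes the quickstart's search into a simple call map. `collect_calls` is a hypothetical helper, not part of `clang.cindex`; it relies only on `walk_preorder()`, `is_definition()`, `spelling`, and `kind.name`, and compares kind names as strings so the function itself needs no imports.

```python
def collect_calls(root):
    """Map each function defined under `root` to the names it calls.

    `root` is expected to be a clang.cindex Cursor, typically tu.cursor.
    """
    calls = {}
    for cursor in root.walk_preorder():
        # Only function definitions; plain declarations (e.g. from headers) are skipped.
        if cursor.kind.name != "FUNCTION_DECL" or not cursor.is_definition():
            continue
        callees = calls.setdefault(cursor.spelling, [])
        # Search the entire subtree, since calls are nested inside the body.
        for node in cursor.walk_preorder():
            if node.kind.name == "CALL_EXPR" and node.spelling:
                callees.append(node.spelling)
    return calls


# Intended use with a parsed translation unit (requires libclang):
#   for function, callees in collect_calls(tu.cursor).items():
#       print(function, "->", callees)
```

Functions that call nothing still appear in the result with an empty list, which keeps the map usable as a list of definitions as well.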