CUDA Python Bindings
cuda-bindings provides low-level Python wrappers for the NVIDIA CUDA driver and runtime C APIs. It is a core component of the broader NVIDIA CUDA Python initiative, which aims to unify and simplify GPU-accelerated computing in Python. The current version is 13.2.0; releases track CUDA Toolkit versions, and development continues toward making Python a first-class language in the CUDA ecosystem.
Common errors
- ModuleNotFoundError: No module named 'cuda'
  cause: The `cuda-python` package (which includes `cuda-bindings`) is not installed, or Python cannot locate it because of environment configuration issues (e.g., the wrong virtual environment is active, or `PYTHONPATH` is misconfigured).
  fix: Install the package with `pip install cuda-python`. If it is already installed, verify that the correct Python environment is activated and the package is importable. A version mismatch with other GPU-accelerated libraries can also trigger this and may require pinning a specific `cuda-python` version.
- ModuleNotFoundError: No module named 'cuda.bindings'
  cause: A `cuda-python`-related package is present, but the `cuda.bindings` submodule cannot be found. This usually means an older `cuda-python` version with a different module layout, or an incomplete/corrupted installation.
  fix: Upgrade to a recent release with `pip install -U cuda-python`, or uninstall and reinstall cleanly if the installation appears corrupted.
- RuntimeError: CUDA driver failed to initialize: <error message>
  cause: A problem with the NVIDIA CUDA driver or its interaction with the `cuda-bindings` library. Common causes include an outdated or incompatible GPU driver, `LD_LIBRARY_PATH` (on Linux) not including the CUDA runtime libraries, or a driver mismatch in containerized environments.
  fix: Update the NVIDIA GPU driver to the latest version. On Linux, ensure `LD_LIBRARY_PATH` includes your CUDA installation's `lib64` directory. In Docker, install and configure the NVIDIA Container Toolkit and run the container with `--runtime=nvidia --gpus all`.
- RuntimeError: ('Unable to allocate CUDA array:', <cudaError_t.cudaErrorInsufficientDriver: 35>)
  cause: The CUDA driver installed on the system is too old for the `cuda-bindings` version in use. The `pip` installer may select a newer `cuda-bindings` release that requires a more recent device driver.
  fix: Update the NVIDIA GPU driver to the latest version. Alternatively, install a `cuda-bindings` version compatible with the current driver, e.g., `pip install cuda-bindings==12.8` (adjust the version to match your CUDA Toolkit).
- from cuda import cuda, cudart
  cause: Not an error message itself, but this import pattern frequently leads to `ModuleNotFoundError` or `ImportError` because the module layout of `cuda-python` changed in recent versions. Direct imports like `from cuda import cuda` are no longer valid or recommended for the core driver/runtime APIs.
  fix: For newer `cuda-python` versions (12.8.0 and above), import the driver and runtime APIs as `from cuda.bindings import driver as cuda` and `from cuda.bindings import runtime as cudart`. The top-level `cuda` module now serves as a meta-package.
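A defensive import pattern covering both layouts can avoid the errors above. This is a sketch; the `load_driver_bindings` helper name is our own, and it returns None rather than raising when cuda-python is not installed at all:

```python
def load_driver_bindings():
    """Return the driver-API module, trying the new layout first.

    Falls back to the legacy layout, and returns None when cuda-python
    is not installed in the current environment.
    """
    try:
        from cuda.bindings import driver  # new layout (cuda-python >= 12.8)
        return driver
    except ImportError:
        pass
    try:
        from cuda import cuda as driver  # legacy layout (older releases)
        return driver
    except ImportError:
        return None  # cuda-python is not installed


cu = load_driver_bindings()
print("driver bindings available:", cu is not None)
```

Callers can then fail with a clear message ("install cuda-python") instead of a bare `ModuleNotFoundError` deep inside an import chain.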
Warnings
- breaking Mismatch between CUDA Toolkit, NVIDIA GPU driver, and `cuda-bindings` versions is a common source of runtime errors, including 'CUDA Driver Version Insufficient', 'No Kernel Image Available', or failure to find CUDA-enabled devices.
- gotcha Updating `cuda-python` (of which `cuda-bindings` is a part) from older versions (e.g., v12.6.2.post1 and below) with `pip install -U cuda-python` might fail or leave a broken installation, because the package was restructured into a meta-package; uninstalling first (`pip uninstall cuda-python`) and then reinstalling is more reliable.
- gotcha `cuda-bindings` provides direct, low-level access to the CUDA C APIs. This requires explicit memory management, device context handling, and kernel configuration, which can be more complex than higher-level libraries like Numba CUDA or CuPy.
- gotcha Out-of-memory (OOM) errors or illegal memory access can occur when dealing with large datasets or complex models, especially on GPUs with limited VRAM, or due to incorrect memory operations within CUDA kernels.
- breaking The `cuda-python` or `cuda-bindings` package is not installed or not accessible in the current Python environment, leading to a `ModuleNotFoundError` when attempting to import `cuda.cuda`.
- breaking Installation of `cuda-bindings` (or `cuda-python`) may fail with 'No matching distribution found' errors, particularly when using newer Python versions (e.g., 3.13) or non-standard operating system/architecture combinations (e.g., Alpine Linux, ARM). Pre-built wheels for `cuda-bindings` are often limited to specific Python versions and common `glibc`-based Linux distributions.
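One way to surface driver/toolkit mismatches early is to query the driver version at startup. A minimal sketch, assuming cuda-python >= 12.8 (the helper name is ours; it returns None when the package is missing or the driver fails to initialize, rather than raising):

```python
def cuda_driver_version():
    """Return the CUDA driver version as (major, minor), or None on failure."""
    try:
        from cuda.bindings import driver as cu  # cuda-python >= 12.8
    except ImportError:
        return None  # cuda-python not installed
    # Every binding returns a tuple whose first element is a CUresult.
    (err,) = cu.cuInit(0)
    if err != cu.CUresult.CUDA_SUCCESS:
        return None  # e.g. no driver present, or driver too old
    err, version = cu.cuDriverGetVersion()
    if err != cu.CUresult.CUDA_SUCCESS:
        return None
    # The driver encodes the version as 1000*major + 10*minor, e.g. 12040 -> (12, 4)
    return version // 1000, (version % 1000) // 10


print("CUDA driver version:", cuda_driver_version())
```

Comparing this against the toolkit version your wheels were built for makes "CUDA Driver Version Insufficient" failures diagnosable before the first kernel launch.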
Install
-
pip install cuda-bindings
Imports
- cuInit
from cuda.bindings import driver as cu  # cuda-python >= 12.8; then call cu.cuInit(0)
import cuda.cuda as cu  # legacy layout (older cuda-python releases)
Quickstart
from cuda.bindings import driver as cu

# Initialize the CUDA driver API. Every binding returns a tuple whose
# first element is a CUresult status code.
(err,) = cu.cuInit(0)
assert err == cu.CUresult.CUDA_SUCCESS

# Get device count
err, count = cu.cuDeviceGetCount()
print(f"Found {count} CUDA devices.")

# Get properties for each device
for i in range(count):
    err, device = cu.cuDeviceGet(i)
    # cuDeviceGetName takes a buffer length and returns NUL-padded bytes
    err, name = cu.cuDeviceGetName(256, device)
    print(f"  Device {i}: {name.decode().rstrip(chr(0)).strip()}")
    err, total_mem = cu.cuDeviceTotalMem(device)
    print(f"  Total Memory: {total_mem / (1024**3):.2f} GB")
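The explicit memory management mentioned under Warnings can be sketched by extending the quickstart: allocate device memory, copy a host buffer up and back, and free it, checking each CUresult by hand. This is a sketch assuming cuda-python >= 12.8; the `gpu_roundtrip` helper name is ours, and it returns None early when cuda-python or a CUDA device is unavailable:

```python
def gpu_roundtrip():
    """Copy a small float buffer to the GPU and back via the driver API.

    Returns the data read back, or None when cuda-python or a device is absent.
    """
    try:
        from cuda.bindings import driver as cu
    except ImportError:
        return None  # cuda-python not installed

    import array

    (err,) = cu.cuInit(0)
    if err != cu.CUresult.CUDA_SUCCESS:
        return None
    err, count = cu.cuDeviceGetCount()
    if err != cu.CUresult.CUDA_SUCCESS or count == 0:
        return None  # no CUDA device available

    # Use the device's primary context instead of creating a new one.
    err, device = cu.cuDeviceGet(0)
    err, ctx = cu.cuDevicePrimaryCtxRetain(device)
    (err,) = cu.cuCtxSetCurrent(ctx)

    host = array.array("f", [1.0, 2.0, 3.0, 4.0])
    nbytes = len(host) * host.itemsize
    err, dptr = cu.cuMemAlloc(nbytes)                    # explicit allocation
    (err,) = cu.cuMemcpyHtoD(dptr, host.buffer_info()[0], nbytes)

    out = array.array("f", [0.0] * len(host))
    (err,) = cu.cuMemcpyDtoH(out.buffer_info()[0], dptr, nbytes)

    (err,) = cu.cuMemFree(dptr)                          # explicit free
    (err,) = cu.cuDevicePrimaryCtxRelease(device)
    return list(out)


print(gpu_roundtrip())
```

Every allocation, copy, and context operation is the caller's responsibility here; forgetting `cuMemFree` or releasing the context while memory is still in use is exactly the class of bug that higher-level libraries like CuPy manage for you.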