CUDA Python
CUDA Python provides a high-performance Python interface to NVIDIA's CUDA Driver and Runtime APIs, allowing direct GPU programming from Python. It bridges Python applications with CUDA-enabled GPUs, enabling GPU acceleration for custom kernels and integration with other CUDA libraries. The current version is 13.2.0 and releases generally align with major CUDA Toolkit updates.
Warnings
- breaking The `cuda-python` package itself does NOT install the CUDA Toolkit or NVIDIA drivers. These are system-level prerequisites that must be installed separately and be compatible with your GPU. Installing `cuda-python` via pip only provides the Python bindings.
- gotcha Compatibility between `cuda-python` package version and the system's CUDA Toolkit version is crucial. While minor version mismatches might work, major version mismatches (e.g., `cuda-python==12.x` with CUDA Toolkit 11.x) are likely to cause `ImportError` or runtime errors.
- gotcha The library exposes multiple API interfaces (e.g., `cuda.bindings.driver` for the Driver API and `cuda.bindings.runtime` for the Runtime API; `from cuda import cuda` and `from cuda import cudart` are the older, deprecated spellings of the same bindings). Choose the API that fits the task (e.g., low-level control vs. higher-level abstractions, or integration with other libraries like Numba/PyTorch).
- gotcha When using the low-level CUDA Driver API (`cuda.bindings.driver`), GPU memory must be allocated and freed manually (e.g., `drv.cuMemAlloc`, `drv.cuMemFree`). Forgetting to free an allocation leaks GPU memory and can eventually exhaust the device.
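One way to make the manual free hard to forget is to wrap the allocation in a context manager. The following is a minimal sketch, not part of the library: `device_buffer` is a hypothetical helper name, and it assumes the caller passes in the driver bindings module (imported as `drv`) and has already called `cuInit` and made a CUDA context current.

```python
from contextlib import contextmanager


@contextmanager
def device_buffer(drv, nbytes):
    """Allocate nbytes of device memory and free it when the block exits.

    Hypothetical helper: `drv` is assumed to be the Driver API bindings
    module, with a CUDA context already current on this thread.
    """
    # cuMemAlloc returns (status, device_pointer); status 0 is CUDA_SUCCESS.
    err, dptr = drv.cuMemAlloc(nbytes)
    if int(err) != 0:
        raise RuntimeError(f"cuMemAlloc failed: {err}")
    try:
        yield dptr
    finally:
        # Always release the allocation, even if the body raised.
        # The status returned by cuMemFree is ignored in this sketch.
        drv.cuMemFree(dptr)
```

With a real driver module this would be used as `with device_buffer(drv, 1024) as dptr: ...`, and the pointer must not be used after the block ends.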
Install
pip install cuda-python
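After installing, it can help to print the versions involved before debugging a suspected mismatch. This is a small sketch using only the standard library; `installed_version` and `decode_cuda_version` are hypothetical helper names, and the distribution name `cuda-python` is taken from the install command above.

```python
from importlib import metadata


def installed_version(dist_name="cuda-python"):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def decode_cuda_version(value):
    """Split CUDA's integer version encoding (1000*major + 10*minor)."""
    return value // 1000, (value % 1000) // 10


print("cuda-python:", installed_version() or "not installed")
```

The integer form is what `cuDriverGetVersion` reports, so on a machine with a working driver the second element of its result tuple can be passed to `decode_cuda_version` and compared with the package's major version.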
Imports
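- driver
from cuda.bindings import driver as drv
- runtime
from cuda.bindings import runtime as rt
- cudart (legacy spelling of the Runtime API bindings, deprecated in favor of `cuda.bindings.runtime`)
from cuda import cudart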
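Because the module layout has changed between releases, code that must run against several versions sometimes probes for the bindings at import time. A minimal sketch, assuming the candidate paths `cuda.bindings.driver` (newer layout) and `cuda.cuda` (legacy layout); `load_driver_bindings` is a hypothetical helper, not a library function.

```python
import importlib


def load_driver_bindings(candidates=("cuda.bindings.driver", "cuda.cuda")):
    """Return the first importable module from candidates, or None.

    The default candidate names are assumptions about the newer and
    legacy module layouts; adjust them for the release you target.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


drv = load_driver_bindings()
if drv is None:
    print("CUDA driver bindings not found; is cuda-python installed?")
```

Returning `None` instead of raising lets the caller decide whether a missing GPU stack is fatal or just disables an optional code path.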
Quickstart
from cuda.bindings import driver as drv

try:
    # Initialize the CUDA Driver API; the flags argument must be 0.
    # Every driver call returns a tuple whose first element is a CUresult
    # status code. Failures are reported there, not raised as exceptions.
    (err,) = drv.cuInit(0)
    if err != drv.CUresult.CUDA_SUCCESS:
        raise RuntimeError(f"cuInit failed: {err.name}")

    # Get the number of available CUDA devices
    err, device_count = drv.cuDeviceGetCount()
    if err == drv.CUresult.CUDA_SUCCESS:
        print(f"Successfully initialized CUDA. Found {device_count} CUDA devices.")
        for i in range(device_count):
            err, device = drv.cuDeviceGet(i)
            if err == drv.CUresult.CUDA_SUCCESS:
                # Get device name (256 is the buffer length)
                err, name_bytes = drv.cuDeviceGetName(256, device)
                if err == drv.CUresult.CUDA_SUCCESS:
                    # Decode the bytes to a string and strip null padding
                    device_name = name_bytes.decode('utf-8').strip('\x00')
                    print(f"  Device {i}: {device_name}")
    else:
        print(f"Failed to get CUDA device count. Error: {err.name}")
except Exception as e:
    # Reached by the cuInit check above, or if the driver library cannot be loaded.
    print(f"A CUDA error occurred: {e}. Ensure the NVIDIA driver and CUDA Toolkit are installed correctly and compatible.")
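The quickstart tests the status code after every call, which gets repetitive in longer programs. A common pattern is a small helper that unpacks the result tuple and raises on failure. This is a sketch under the assumption that each call returns `(status, *values)` with status `0` meaning success; `check` is a hypothetical name, not part of the bindings.

```python
def check(result):
    """Unpack a (status, *values) tuple and raise if status is nonzero.

    Hypothetical helper: returns None, a single value, or a tuple of
    values depending on how many the call produced.
    """
    status, *values = result
    if int(status) != 0:  # 0 is CUDA_SUCCESS
        name = getattr(status, "name", str(status))
        raise RuntimeError(f"CUDA call failed: {name}")
    if not values:
        return None
    if len(values) == 1:
        return values[0]
    return tuple(values)
```

With it, a call such as `device_count = check(drv.cuDeviceGetCount())` either yields the value or raises, so the calling code no longer needs nested `if` blocks.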