NIXL Python API (CUDA 12)
NIXL (NVIDIA Inference Xfer Library) is an open-source library with a Python API, designed to accelerate point-to-point communication in AI inference frameworks. It provides a unified abstraction layer over heterogeneous memory (CPU, GPU) and storage (file, block, object store) through a modular plugin architecture. The `nixl-cu12` package specifically targets CUDA 12 environments. NIXL is actively maintained with frequent releases; version 1.0.0 is the current stable release.
Warnings
- breaking NIXL 1.0.0 introduces significant breaking changes: the legacy Multi-Object UCX backend has been removed, and Device API V1 has been dropped in favor of a complete transition to Device API V2.
- gotcha Installing both `nixl-cu12` and `nixl-cu13` (or their respective meta-package variants) in the same environment can lead to unexpected behavior. If both are present, `nixl-cu13` will take precedence.
- gotcha NIXL is currently only supported on Linux environments (tested on Ubuntu and Fedora). It does not support macOS or Windows.
- gotcha Errors such as `NIXL_ERR_BACKEND` often indicate issues with the underlying backend configuration or its availability. This can stem from missing dependencies (like UCX or specific CUDA libraries), incorrect paths, or misconfigurations during agent initialization.
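The cu12/cu13 coexistence gotcha above can be detected programmatically. This is a sketch using only the standard library; the distribution names `nixl-cu12` and `nixl-cu13` are taken from the warning above.

```python
from importlib.metadata import distributions

# Collect any installed NIXL wheels to detect the cu12/cu13 clash.
nixl_dists = sorted(
    d.metadata["Name"]
    for d in distributions()
    if (d.metadata["Name"] or "").lower().startswith("nixl")
)
if {"nixl-cu12", "nixl-cu13"} <= set(nixl_dists):
    print("Warning: both nixl-cu12 and nixl-cu13 installed; cu13 takes precedence.")
print(nixl_dists)
```

Running this in a clean CUDA 12 environment should print a list containing only the cu12 wheel.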
Install
- pip install nixl-cu12
- pip install "nixl[cu12]"  # CUDA 12 meta-package; for backwards compatibility, `pip install nixl` installs nixl[cu12]
Imports
- nixl_agent
from nixl import nixl_agent
Quickstart
import nixl

# Initialize a NIXL agent
try:
    agent = nixl.nixl_agent("my_inference_agent")
    print(f"NIXL agent '{agent.name}' initialized successfully.")
    # Further NIXL operations would follow here, e.g., memory registration,
    # transfer requests.
except Exception as e:
    print(f"Error initializing NIXL agent: {e}")
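Because NIXL is Linux-only (see Warnings), scripts that must also load on unsupported platforms or CUDA-less CI machines can guard the import. This is a sketch: only the `nixl_agent` constructor from the Quickstart is assumed, and `create_agent` is a hypothetical helper.

```python
# Guarded import: degrade gracefully where NIXL (Linux-only) is unavailable,
# e.g. on macOS/Windows dev machines or in CI without CUDA.
try:
    from nixl import nixl_agent  # import path as shown in the Imports section
    NIXL_AVAILABLE = True
except ImportError:
    nixl_agent = None
    NIXL_AVAILABLE = False

def create_agent(name: str):
    """Return a NIXL agent, or None when NIXL is unavailable or fails to init."""
    if not NIXL_AVAILABLE:
        print(f"NIXL not installed; skipping agent '{name}'.")
        return None
    try:
        return nixl_agent(name)
    except Exception as exc:  # e.g. NIXL_ERR_BACKEND from a misconfigured backend
        print(f"NIXL agent init failed: {exc}")
        return None

agent = create_agent("my_inference_agent")
```

This keeps the import-time failure (package not installed) distinct from the init-time failure (`NIXL_ERR_BACKEND` and friends) that the Quickstart's try/except handles.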