KvikIO (CUDA 12)
KvikIO is a Python and C++ library for high-performance file I/O. It provides bindings to cuFile, which enables GPUDirect Storage (GDS), and handles I/O to and from both host and device (GPU) memory. It is part of the RAPIDS suite and is actively maintained, with frequent releases (typically monthly or bi-monthly) aligned with the broader RAPIDS ecosystem. The current version is 26.4.0.
Common errors
- ERROR: Could not build wheels for kvikio which use PEP 517 and cannot be installed directly
  cause: This error typically occurs when `pip` attempts to build KvikIO from source, but required build dependencies (such as a C++ compiler, CUDA Toolkit headers, or Cython) are missing or misconfigured in the environment.
  fix: Ensure you have a C++ compiler (e.g., `gcc`), the CUDA Toolkit (12.x) installed and accessible, and potentially `cython`. For most users, installing via `conda` or `mamba` from the `rapidsai` channel is recommended, because it resolves the complex dependencies automatically: `mamba install -c rapidsai -c conda-forge kvikio`.
- RuntimeError: cuFile API error or GPUDirect Storage (GDS) not available.
  cause: KvikIO attempted to use cuFile/GDS but failed. Possible reasons include missing NVIDIA drivers, an incorrectly configured cuFile installation, an unsupported file system, or GDS not being enabled or compatible with your hardware/software stack.
  fix: Verify that NVIDIA drivers are up to date, that cuFile is installed and configured (refer to the NVIDIA documentation), and that your storage system (NVMe, NVMe-oF) and kernel modules support GDS. If GDS is not critical for your use case, set the environment variable `KVIKIO_COMPAT_MODE=ON` to force KvikIO to fall back to POSIX I/O.
- AssertionError: Not all elements are equal (when comparing CuPy arrays after read/write)
  cause: Data corruption or incomplete I/O. This can happen if buffers are not correctly sized, if file operations are interrupted, or if unaligned reads/writes are not properly handled while GDS is active.
  fix: Double-check buffer sizes and file offsets, and ensure I/O operations complete fully. For GDS performance-critical paths, confirm byte ranges are 4KB-aligned. Also check file system integrity and available disk space.
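The 4KB alignment check mentioned in the last fix can be sketched in plain Python. This is a minimal illustration, not part of KvikIO: the names `PAGE_SIZE`, `is_page_aligned`, and `round_up` are hypothetical helpers, and the 4096-byte page size is taken from the text above.

```python
PAGE_SIZE = 4096  # 4 KiB, the alignment the text above refers to (assumed fixed here)


def is_page_aligned(file_offset: int, nbytes: int, page_size: int = PAGE_SIZE) -> bool:
    """Return True when both the start offset and the length are multiples of page_size."""
    return file_offset % page_size == 0 and nbytes % page_size == 0


def round_up(nbytes: int, page_size: int = PAGE_SIZE) -> int:
    """Round a byte count up to the next multiple of page_size."""
    return -(-nbytes // page_size) * page_size


# 100 int64 values occupy 800 bytes, which is not a multiple of 4096.
print(is_page_aligned(0, 100 * 8))   # False
print(round_up(100 * 8))             # 4096
print(is_page_aligned(4096, 8192))   # True
```

A check like this only flags ranges that KvikIO would have to split. It does not detect corruption, so comparing the data after a round trip is still needed.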
Warnings
- breaking: KvikIO versions 25.12.00 and newer (including 26.x.x) no longer support building or running without a CUDA installation. A working CUDA 12 Toolkit is a hard requirement for `libkvikio-cu12`.
- breaking: Starting with version 25.10.00, Python nvCOMP bindings and direct Zarr 2 support have been removed. Users relying on these features for compression or Zarr 2 may need to adjust their workflows or use older KvikIO versions.
- breaking: Version 25.08.00 removed CUDA 11 from its supported dependencies. Users on CUDA 11 must use KvikIO versions prior to 25.08.00 or upgrade their CUDA environment to 12.x.
- gotcha: For optimal performance with GPUDirect Storage, I/O operations (reads/writes) should be aligned to a GPU page boundary (typically 4KB). Unaligned operations may still work, but KvikIO has to split them, which can reduce performance.
- breaking: In version 24.12.00, KvikIO shifted to being built as a shared library and introduced a new 'AUTO' compatibility mode. This may affect linking for C++ users and change the default fallback behavior for Python users when GDS is unavailable.
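When pinning or upgrading, it can help to see which of the breaking changes above apply to a given release. The sketch below simply transcribes the version thresholds from this list. `parse_version`, `BREAKING_CHANGES`, and `changes_affecting` are hypothetical names for illustration and are not provided by KvikIO.

```python
def parse_version(text: str) -> tuple:
    """Turn '25.08.00' or '26.4.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# (first affected release, note), transcribed from the warnings above.
BREAKING_CHANGES = [
    ("24.12.00", "built as a shared library; 'AUTO' compatibility mode introduced"),
    ("25.08.00", "CUDA 11 no longer supported"),
    ("25.10.00", "Python nvCOMP bindings and direct Zarr 2 support removed"),
    ("25.12.00", "cannot be built or run without a CUDA installation"),
]


def changes_affecting(version: str) -> list:
    """List the notes for every breaking change at or below the given version."""
    v = parse_version(version)
    return [note for first, note in BREAKING_CHANGES if v >= parse_version(first)]


for note in changes_affecting("25.08.00"):
    print(note)
```

For example, a project that must stay on CUDA 11 would want a version for which "CUDA 11 no longer supported" does not appear in the result.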
Install
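- pip install kvikio-cu12
  (`kvikio-cu12` is the Python package that provides `import kvikio`; it depends on `libkvikio-cu12`, which carries the C++ library.)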
Imports
- kvikio
import kvikio
- CuFile
from kvikio.cufile import CuFile
(not `import kvikio.CuFile`: `CuFile` is a class, not a submodule, so that statement fails)
Quickstart
import os
import cupy
from kvikio.cufile import CuFile
# Ensure a temporary file path is available
file_path = os.environ.get('KVIKIO_TEST_FILE_PATH', '/tmp/kvikio-example-data.bin')
# Create a CuPy array on the GPU
a = cupy.arange(100, dtype=cupy.int64)
print(f"Writing CuPy array to {file_path} using KvikIO...")
# Write the array to a file using KvikIO
with CuFile(file_path, 'w') as f:
    f.write(a)
print(f"Reading data from {file_path} back into a CuPy array...")
# Read data back into a new CuPy array
b = cupy.empty_like(a)
with CuFile(file_path, 'r') as f:
    f.read(b)
# Verify the data
assert cupy.array_equal(a, b)
print("Data written and read successfully, and arrays match!")
# Clean up the test file
if os.path.exists(file_path):
    os.remove(file_path)
    print(f"Cleaned up {file_path}")
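On a machine without working GDS, the quickstart can be run through the POSIX fallback by setting `KVIKIO_COMPAT_MODE=ON`, as described under Common errors. The sketch below sets the variable from Python. `enable_compat_mode` is a hypothetical helper, and setting the variable before the first `import kvikio` is a conservative assumption about when KvikIO reads its environment.

```python
import os


def enable_compat_mode(env=os.environ) -> None:
    """Request the POSIX fallback by setting KVIKIO_COMPAT_MODE=ON in the given mapping."""
    env["KVIKIO_COMPAT_MODE"] = "ON"


# Set the variable first, then import (assumption: KvikIO may read it at load time).
enable_compat_mode()

try:
    import kvikio  # noqa: F401
except (ImportError, OSError):
    kvikio = None  # KvikIO or its shared library is not available in this environment

print(os.environ["KVIKIO_COMPAT_MODE"])
```

Exporting the variable in the shell before starting Python achieves the same thing and avoids any dependence on import order.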