NVRTC Native Runtime Libraries for CUDA 11
NVRTC (NVIDIA Runtime Compilation) is a runtime compilation library for CUDA C++ that enables just-in-time (JIT) compilation of CUDA kernels from source code into PTX (Parallel Thread Execution) code. This Python package (`nvidia-cuda-nvrtc-cu11`) provides the native shared libraries for NVRTC specifically for CUDA 11.x environments. It acts as a foundational component for higher-level Python bindings and frameworks that leverage dynamic CUDA kernel generation. The current version is 11.8.89, with its initial release on October 3, 2022, and subsequent wheel metadata updates on August 16, 2024.
Warnings
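- breaking NVRTC in CUDA 11.0 and later no longer implicitly adds `/usr/include` to the header file search path during compilation. Kernels that include headers from that directory must pass it explicitly, for example with `--include-path=/usr/include` (short form `-I`).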
- gotcha This package (`nvidia-cuda-nvrtc-cu11`) provides only the native NVRTC shared libraries. It does not expose a direct Python API itself. Users must install a separate Python binding library, such as `pynvrtc` or `cuda-python`, to programmatically interact with NVRTC.
- gotcha A mismatch between the installed `nvidia-cuda-nvrtc-cu11` version (or the CUDA version it implies) and the NVIDIA GPU driver or other CUDA-dependent libraries (like PyTorch or CuPy) can lead to load failures, typically reported as an `ImportError` or `OSError` stating that `libnvrtc.so.<VERSION>` could not be found or opened.
- gotcha Compiling CUDA source to PTX does not require a GPU: the NVRTC library itself can run on a system without one. Loading and executing the generated PTX does require a compatible NVIDIA GPU and an installed NVIDIA display driver.
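
One way to diagnose the version mismatch described in the warnings is to ask a given library file which NVRTC version it implements. The sketch below calls the NVRTC C function `nvrtcVersion` through `ctypes`. The helper name `probe_nvrtc_version` is hypothetical, and the path you pass in depends on where the library lives on your system.

```python
import ctypes


def probe_nvrtc_version(lib_path):
    """Return (major, minor) reported by the NVRTC library at lib_path.

    Returns None if the library cannot be loaded or the call fails.
    probe_nvrtc_version is a hypothetical helper, not part of any package.
    """
    try:
        lib = ctypes.CDLL(lib_path)
        version_fn = lib.nvrtcVersion
    except (OSError, AttributeError):
        # Library missing, unmet dependencies, or symbol not exported
        return None

    # C signature: nvrtcResult nvrtcVersion(int *major, int *minor);
    version_fn.restype = ctypes.c_int
    version_fn.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)]

    major = ctypes.c_int(0)
    minor = ctypes.c_int(0)
    status = version_fn(ctypes.byref(major), ctypes.byref(minor))
    if status != 0:  # NVRTC_SUCCESS is 0
        return None
    return major.value, minor.value
```

Calling `probe_nvrtc_version("libnvrtc.so")` (an assumed library name) returns `None` when the loader cannot find the file, which is the same condition that produces the load error described above.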
Install
pip install nvidia-cuda-nvrtc-cu11
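
Since the package ships only native libraries, a quick way to confirm the install is to look for the library files on disk. The following is a minimal sketch that assumes the wheel unpacks under a `nvidia/cuda_nvrtc` directory inside site-packages; that layout and the helper name `find_nvrtc_libraries` are assumptions, not something this package documents.

```python
import glob
import os
import site


def find_nvrtc_libraries(roots=None):
    """List NVRTC shared-library files found under <root>/nvidia/cuda_nvrtc.

    The nvidia/cuda_nvrtc layout is an assumption about how the wheel
    unpacks; find_nvrtc_libraries is a hypothetical helper.
    """
    if roots is None:
        roots = []
        # These site functions may be missing in some virtualenv setups
        get_site = getattr(site, "getsitepackages", None)
        if get_site is not None:
            roots.extend(get_site())
        get_user = getattr(site, "getusersitepackages", None)
        if get_user is not None:
            roots.append(get_user())

    patterns = ("libnvrtc*.so*", "nvrtc*.dll")  # Linux and Windows file names
    found = set()
    for root in roots:
        base = os.path.join(root, "nvidia", "cuda_nvrtc")
        for pattern in patterns:
            found.update(glob.glob(os.path.join(base, "**", pattern), recursive=True))
    return sorted(found)
```

An empty result from `find_nvrtc_libraries()` suggests either that the package is not installed in the active environment or that the assumed directory layout does not hold.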
Imports
- Program
from pynvrtc.compiler import Program
- NVRTCInterface
from pynvrtc.interface import NVRTCInterface
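
These imports only work when a binding library is installed alongside this package. A small sketch for checking which bindings are importable before relying on them; `available_bindings` is a hypothetical helper, and the module names `pynvrtc` and `cuda` (the import name used by `cuda-python`) are the assumed candidates.

```python
import importlib.util


def available_bindings(candidates=("pynvrtc", "cuda")):
    """Return the candidate module names that can be imported.

    available_bindings is a hypothetical helper; only top-level
    module names are checked, nothing is actually imported.
    """
    return [name for name in candidates if importlib.util.find_spec(name) is not None]
```

If the returned list is empty, install one of the binding libraries before running the quickstart.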
Quickstart
from pynvrtc.compiler import Program, ProgramException

# Example CUDA C++ kernel source code
cuda_source_code = '''
extern "C" __global__
void add(int *a, int *b, int *c, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
    {
        c[idx] = a[idx] + b[idx];
    }
}
'''

try:
    # Compile the CUDA source code to PTX using the Program API.
    # pynvrtc loads the NVRTC shared library (such as the one shipped in
    # nvidia-cuda-nvrtc-cu11) at runtime, so it must be discoverable by the loader.
    program = Program(cuda_source_code, 'add_kernel.cu')
    ptx_code = program.compile(['-arch=compute_60'])  # Adjust arch for your GPU
    print("PTX code generated successfully. First 200 chars:\n", ptx_code[:200], '...')
    # In a real application, ptx_code would then be loaded and executed
    # using a CUDA driver API wrapper (e.g., from `cuda-python` or `pycuda`).
    # That part requires more setup (context, module, kernel launch) and is omitted for brevity.
except ProgramException as e:
    # Raised by pynvrtc when NVRTC reports a compilation failure
    print(f"Error during NVRTC compilation: {e}")
except Exception as e:
    # Anything else, e.g. the NVRTC library could not be loaded
    print(f"An unexpected error occurred: {e}")
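
Because NVRTC in CUDA 11 no longer searches `/usr/include` implicitly, kernels that include system headers need that directory passed as a compile option. A minimal sketch of assembling the option list given to `program.compile(...)` follows. `build_nvrtc_options` is a hypothetical helper, and its default architecture simply mirrors the quickstart; `--gpu-architecture`, `--include-path`, and `--std` are NVRTC option names.

```python
def build_nvrtc_options(arch="compute_60", include_dirs=(), std=None):
    """Build a list of NVRTC option strings.

    build_nvrtc_options is a hypothetical helper for illustration.
    arch: virtual architecture, e.g. "compute_60"
    include_dirs: directories added to the header search path
    std: optional C++ dialect, e.g. "c++14"
    """
    options = [f"--gpu-architecture={arch}"]
    for directory in include_dirs:
        # Each search directory gets its own --include-path option
        options.append(f"--include-path={directory}")
    if std is not None:
        options.append(f"--std={std}")
    return options
```

For example, `build_nvrtc_options(include_dirs=["/usr/include"])` yields a list that restores the pre-11.0 header search behavior for a single compilation.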