NVIDIA cuDNN Runtime Libraries for CUDA 12


The `nvidia-cudnn-cu12` package provides the NVIDIA CUDA Deep Neural Network (cuDNN) runtime libraries, which are GPU-accelerated primitives essential for deep neural network operations such as convolutions, attention, and matrix multiplication. It acts as a critical low-level dependency, enabling deep learning frameworks like TensorFlow and PyTorch to efficiently leverage NVIDIA GPUs. This specific package targets CUDA 12.x environments. The current version is 9.20.0.48, with releases frequently updated to align with new CUDA Toolkit versions and cuDNN backend enhancements.

pip install nvidia-cudnn-cu12
error Could not load dynamic library 'cudnn64_9.dll'
cause The system or deep learning framework cannot find the necessary cuDNN shared libraries because they are either not installed correctly, their location is not in the system's PATH (Windows) or LD_LIBRARY_PATH (Linux) environment variable, or there's a file permissions issue.
fix
Ensure the CUDA Toolkit and nvidia-cudnn-cu12 are installed. For manual cuDNN installations (less common when using the wheel), add the bin directories of your CUDA and cuDNN installations to the PATH environment variable on Windows (e.g., C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\bin and C:\Program Files\NVIDIA\CUDNN\v9.x\bin), and add the corresponding lib64 directories to LD_LIBRARY_PATH on Linux. If you installed the nvidia-cudnn-cu12 wheel, verify that its libraries landed in a discoverable location and that your framework is configured to use pip-installed NVIDIA libraries; see the sketch below.
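A minimal sketch of how to check where the wheel placed the cuDNN libraries and what the loader search paths currently contain. The nvidia/cudnn directory layout inside site-packages is the usual wheel layout, but treat it as an assumption and adjust for your environment:

import glob
import os
import site

# Look for cuDNN shared libraries inside the installed wheel (typically
# site-packages/nvidia/cudnn/lib on Linux or .../nvidia/cudnn/bin on Windows).
for base in site.getsitepackages():
    hits = glob.glob(os.path.join(base, "nvidia", "cudnn", "*", "*cudnn*"))
    if hits:
        print("cuDNN libraries found under:", os.path.dirname(hits[0]))

# The directories printed above must be visible to the dynamic loader.
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<not set>"))
print("PATH:", os.environ.get("PATH", "<not set>"))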
error RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
cause cuDNN failed to initialize, often due to insufficient GPU memory, an outdated or incompatible GPU driver, or a problem during the initial setup of the cuDNN context within the deep learning framework.
fix
Update your GPU driver to the latest version compatible with your CUDA Toolkit. Ensure enough GPU memory is available before running cuDNN-backed operations, for example by clearing the PyTorch cache (torch.cuda.empty_cache()) or reducing batch sizes. Verify that CUDA and cuDNN are correctly installed and compatible with your deep learning framework; the snippet below shows one way to check this from PyTorch.
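A short check from PyTorch, assuming a CUDA-enabled PyTorch build is installed, that reports the cuDNN version it loaded and frees cached GPU memory before retrying:

import torch

print("CUDA available:", torch.cuda.is_available())
print("cuDNN enabled:", torch.backends.cudnn.enabled)
print("cuDNN version:", torch.backends.cudnn.version())

if torch.cuda.is_available():
    torch.cuda.empty_cache()                 # release cached allocations back to the driver
    free, total = torch.cuda.mem_get_info()  # free / total device memory in bytes
    print(f"GPU memory free: {free / 1e9:.2f} GB of {total / 1e9:.2f} GB")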
error Loaded runtime CuDNN library: X but source was compiled with: Y. CuDNN library needs to have matching major version and equal or higher minor version.
cause The version of the cuDNN library loaded at runtime (either from the system or an installed wheel) does not match the version that the deep learning framework (e.g., TensorFlow, JAX, or PyTorch) was compiled against. This can occur when mixing different installation methods or having multiple cuDNN versions accessible.
fix
Ensure the installed nvidia-cudnn-cu12 version satisfies your deep learning framework's CUDA and cuDNN requirements. If using pip, reinstall the framework so it pulls in the correct cuDNN dependency, or pin the exact nvidia-cudnn-cu12 version your framework requires. Use virtual environments to isolate different CUDA/cuDNN configurations; the snippet below shows one way to compare the installed wheel against the version the framework was built with.
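One way to compare the two versions, sketched under the assumption that TensorFlow and the nvidia-cudnn-cu12 wheel are both installed in the active environment:

import importlib.metadata
import tensorflow as tf

# Version of the pip-installed cuDNN runtime wheel.
wheel_version = importlib.metadata.version("nvidia-cudnn-cu12")
# cuDNN version TensorFlow was compiled against (from its build info).
built_with = tf.sysconfig.get_build_info().get("cudnn_version", "unknown")

print("nvidia-cudnn-cu12 wheel:", wheel_version)
print("TensorFlow built against cuDNN:", built_with)
# The major versions must match; the runtime's minor version must be >= the build version.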
error E: Unable to locate package cudnn9-cuda-12
cause The Linux package manager (apt/dnf/zypper) cannot find the specified cuDNN package for CUDA 12 in its configured repositories. This can happen if the NVIDIA repositories are not correctly added or enabled, or the package name is incorrect for the specific cuDNN and CUDA version combination.
fix
Verify that the NVIDIA CUDA repository for your distribution is correctly added and updated. Use apt-cache search cudnn (Ubuntu/Debian) or similar commands to find the exact package names available for your CUDA version (e.g., libcudnn8 instead of cudnn9-cuda-12). Follow the official NVIDIA cuDNN installation guide for your Linux distribution and CUDA version.
error cuDNN Error: CUDNN_STATUS_BAD_PARAM
cause A deep learning operation using cuDNN was called with invalid input parameters (e.g., incorrect tensor dimensions, incompatible data types, or invalid convolution settings). This error often indicates a logical issue in the application code rather than an installation problem.
fix
Carefully review the code performing the operation that triggers the error. Check tensor shapes, data types (e.g., float vs. long), and convolution parameters (padding, stride, dilation) to ensure they are valid, mutually consistent, and supported by cuDNN for your network architecture. Running the failing operation on the CPU can sometimes yield a more descriptive error message; a quick shape and dtype sanity check is sketched below.
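A quick sanity check, sketched with TensorFlow, of the shape and dtype constraints that most often trigger CUDNN_STATUS_BAD_PARAM in convolutions (the tensors here are illustrative):

import tensorflow as tf

x = tf.random.normal([8, 32, 32, 3])   # NHWC input, float32
k = tf.random.normal([3, 3, 3, 16])    # HWIO kernel; in-channels must match x's channels

assert x.dtype == k.dtype, "input and kernel dtypes must match"
assert x.shape[-1] == k.shape[2], "kernel in-channels must equal input channels"

y = tf.nn.conv2d(x, k, strides=1, padding="SAME")
print(y.shape)  # (8, 32, 32, 16)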
breaking Starting with CUDA 12.5, cuDNN is no longer bundled directly within the CUDA Toolkit installer. Users (especially C++ toolchain developers) must now manage cuDNN installation and versioning separately, although `pip install nvidia-cudnn-cu12` simplifies this for Python environments.
fix Ensure you are installing `nvidia-cudnn-cu12` (or the appropriate CUDA version) via pip, or manually managing separate cuDNN archives if not using Python wheels, and verify compatibility with your CUDA Toolkit version.
gotcha Direct Python API calls for `nvidia-cudnn-cu12` are not available. This package provides the low-level runtime binaries. To programmatically interact with cuDNN functionality in Python (e.g., build computation graphs), you must install the `nvidia-cudnn-frontend` package separately and import it as `cudnn`.
fix If you intend to use cuDNN's API directly from Python, install the frontend with `pip install nvidia-cudnn-frontend` and then `import cudnn` in your code, as in the sketch below.
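A minimal sketch, assuming `nvidia-cudnn-frontend` is installed, showing that the import name is `cudnn` and that the frontend reports the backend (runtime) version it bound to:

import cudnn  # provided by nvidia-cudnn-frontend, not by nvidia-cudnn-cu12

# Integer version of the cuDNN backend the frontend found, e.g. 9xxxx for cuDNN 9.x.
print(cudnn.backend_version())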
gotcha Version compatibility between `nvidia-cudnn-cu12`, the installed NVIDIA CUDA Toolkit, and your deep learning framework (e.g., TensorFlow, PyTorch) is crucial. Frameworks are often built against specific cuDNN versions. Installing a standalone `nvidia-cudnn-cu12` might not be compatible with the version your framework expects, leading to runtime errors (e.g., 'DLL load failed' or 'cuDNN initialization error').
fix Always refer to the official documentation of your deep learning framework for recommended or required CUDA and cuDNN versions. When possible, allow the framework's installation (e.g., `pip install tensorflow[and-cuda]`) to manage cuDNN dependencies, or carefully match versions if installing separately.
gotcha `nvidia-cudnn-cu12` implies compatibility with CUDA Toolkit 12.x. Using it with an older or incompatible CUDA Toolkit version installed on your system can lead to runtime issues or failures in GPU acceleration.
fix Ensure your installed NVIDIA CUDA Toolkit version is 12.x and that your GPU drivers are up-to-date and compatible with CUDA 12.x.
breaking The `nvidia-cudnn-cu12` package is a placeholder that requires downloading the actual wheel from NVIDIA's PyPI index. If `https://pypi.nvidia.com` is not used, either implicitly or as an explicitly specified index, `pip` will fail to find or download the actual wheel, producing a 'Didn't find wheel' error during metadata preparation.
fix Install the package by explicitly adding NVIDIA's PyPI as an extra index URL: `pip install --extra-index-url https://pypi.nvidia.com nvidia-cudnn-cu12`.
pip install nvidia-cudnn-cu12==9.20.0.48
python os / libc variant status wheel install import disk
3.10 alpine (musl) nvidia-cudnn-cu12 - - - -
3.10 alpine (musl) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.10 slim (glibc) nvidia-cudnn-cu12 - - - -
3.10 slim (glibc) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.11 alpine (musl) nvidia-cudnn-cu12 - - - -
3.11 alpine (musl) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.11 slim (glibc) nvidia-cudnn-cu12 - - - -
3.11 slim (glibc) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.12 alpine (musl) nvidia-cudnn-cu12 - - - -
3.12 alpine (musl) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.12 slim (glibc) nvidia-cudnn-cu12 - - - -
3.12 slim (glibc) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.13 alpine (musl) nvidia-cudnn-cu12 - - - -
3.13 alpine (musl) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.13 slim (glibc) nvidia-cudnn-cu12 - - - -
3.13 slim (glibc) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.9 alpine (musl) nvidia-cudnn-cu12 - - - -
3.9 alpine (musl) nvidia-cudnn-cu12==9.20.0.48 - - - -
3.9 slim (glibc) nvidia-cudnn-cu12 - - - -
3.9 slim (glibc) nvidia-cudnn-cu12==9.20.0.48 - - - -

This quickstart demonstrates how to verify that a deep learning framework such as TensorFlow can detect and use your GPU and, implicitly, the underlying cuDNN runtime. Successful execution of this code confirms that `nvidia-cudnn-cu12` is likely installed correctly and accessible to TensorFlow. Note that the cuDNN version reported by TensorFlow is the version it was *built with*, not necessarily the exact version loaded at runtime, though they should be compatible.

import os

# Ask TensorFlow not to pre-allocate all GPU memory. Setting this before the
# TensorFlow import ensures it is seen when the GPU is first initialized.
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

import tensorflow as tf

# Check if TensorFlow can detect and use GPUs
gpus = tf.config.list_physical_devices('GPU')

if gpus:
    print(f"TensorFlow detected the following GPUs: {gpus}")
    try:
        # Enable memory growth so TF allocates GPU memory on demand (alternative to the env var above)
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        print("GPU memory growth set to True.")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(f"Error setting memory growth: {e}")
    print(f"TensorFlow is built with CUDA: {tf.test.is_built_with_cuda()}")
    # cuDNN version TensorFlow was compiled against (build-info keys are lowercase)
    print(f"TensorFlow's built-in cuDNN version: {tf.sysconfig.get_build_info().get('cudnn_version', 'N/A')}")

    # A small operation to trigger GPU usage if available
    try:
        with tf.device('/GPU:0'):
            a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
            b = tf.constant([[1.0, 1.0], [1.0, 1.0]])
            c = tf.matmul(a, b)
            print(f"Simple matrix multiplication on GPU: {c.numpy()}")
    except RuntimeError as e:
        print(f"Could not run on GPU: {e}. Running on CPU instead.")
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        b = tf.constant([[1.0, 1.0], [1.0, 1.0]])
        c = tf.matmul(a, b)
        print(f"Simple matrix multiplication on CPU: {c.numpy()}")
else:
    print("TensorFlow did not detect any GPUs. Please ensure CUDA and cuDNN are correctly installed and configured.")