{"id":664,"library":"nvidia-cufft-cu12","title":"NVIDIA cuFFT for CUDA 12","description":"nvidia-cufft-cu12 provides the native runtime libraries for NVIDIA's CUDA Fast Fourier Transform (cuFFT) product, a GPU-accelerated library for performing FFT calculations. It is a fundamental component for various scientific and engineering applications, including deep learning, computer vision, and computational physics. The library is actively maintained by the Nvidia CUDA Installer Team and receives frequent updates; the current version is 11.4.1.4, released on June 5, 2025. It primarily serves as a low-level dependency for higher-level Python frameworks and libraries that leverage GPU-accelerated FFTs.","status":"active","version":"11.4.1.4","language":"python","source_language":"en","source_url":"https://developer.nvidia.com/cufft","tags":["cuda","nvidia","runtime","fft","fast fourier transform","gpu","scientific computing","deep learning","machine learning","signal processing","mathematics"],"install":[{"cmd":"pip install nvidia-cufft-cu12","lang":"bash","label":"Install via pip"},{"cmd":"pip install nvmath-python[cu12]","lang":"bash","label":"Install with nvmath-python (recommended for Python users)"}],"dependencies":[{"reason":"Required runtime dependency for the cuFFT library.","package":"nvidia-nvjitlink-cu12"},{"reason":"Provides Pythonic APIs that leverage cuFFT for direct Python usage. Installing `nvmath-python[cu12]` handles this dependency.","package":"nvmath-python","optional":true},{"reason":"PyTorch natively supports cuFFT for accelerated FFTs on CUDA devices.","package":"torch","optional":true},{"reason":"TensorFlow utilizes cuFFT for GPU-accelerated operations.","package":"tensorflow","optional":true}],"imports":[{"note":"nvidia-cufft-cu12 primarily provides the underlying C/C++ binaries for GPU-accelerated FFTs. 
Python users typically interact with cuFFT through frameworks like PyTorch, TensorFlow, or dedicated Python wrappers like `nvmath-python`.","symbol":"cuFFT functionality","correct":"This package is a low-level runtime library and is not directly imported in Python. High-level libraries like `nvmath-python`, `torch`, or `tensorflow` provide Python interfaces that utilize `nvidia-cufft-cu12` under the hood."},{"note":"This is the recommended way to directly access cuFFT functionality in Python via NVIDIA's `nvmath-python` library, which depends on `nvidia-cufft-cu12`.","symbol":"nvmath.fft","correct":"from nvmath.fft import fft, ifft"}],"quickstart":{"code":"import nvmath.fft as nvfft\nimport cupy as cp\n\n# Ensure CUDA is available and nvmath-python is correctly set up\n# (e.g., pip install nvmath-python[cu12] and appropriate CUDA Toolkit installation)\n\n# Example: Perform a 1D complex-to-complex FFT using nvmath-python\nsize = 1024\nx = cp.arange(size, dtype=cp.complex64)\n\n# Perform forward FFT\ny = nvfft.fft(x)\n\n# Perform inverse FFT\n# Note: cuFFT itself does not normalize transforms. If z comes back scaled by\n# `size`, divide it by size before comparing with x.\nz = nvfft.ifft(y)\n\nprint(f\"Original data (first 5 elements): {x[:5].tolist()}\")\nprint(f\"FFT result (first 5 elements): {y[:5].tolist()}\")\nprint(f\"Inverse FFT result (first 5 elements): {z[:5].tolist()}\")\nprint(f\"Difference from original (max abs error): {cp.max(cp.abs(x - z))}\")\n","lang":"python","description":"This quickstart demonstrates how to perform a 1D complex-to-complex FFT and inverse FFT using `nvmath-python`, which leverages the `nvidia-cufft-cu12` runtime library. Ensure `cupy` is also installed for GPU array operations; depending on the `nvmath-python` version, it may not be pulled in automatically."},"warnings":[{"fix":"Ensure your GPU hardware has compute capability 5.0 (SM50) or higher for CUDA 12.0+ applications.","message":"Removed GPU architectures: From CUDA 12.0 onwards, GPU architectures SM35 and SM37 are no longer supported. The minimum required architecture is SM50. 
Earlier toolkits (e.g., CUDA 11.0) likewise dropped older architectures such as SM30.","severity":"breaking","affected_versions":"CUDA 11.0+"},{"fix":"Migrate to Link-Time Optimized (LTO) callbacks, which are supported from CUDA 12.6 Update 2 onwards, to avoid deprecation issues and leverage improved performance.","message":"Legacy cuFFT callback functionality: Support for callback routines using separately compiled device code (legacy callbacks) has been deprecated since CUDA 11.4. CUDA Graphs capture for legacy callbacks that load data in out-of-place transforms is no longer supported as of CUDA 11.8.","severity":"deprecated","affected_versions":"CUDA 11.4+"},{"fix":"Consider updating to CUDA 12.6 Update 2 or newer and migrating to LTO callbacks, or investigate the performance implications for your specific callback implementations.","message":"Performance degradation with legacy callbacks: Users have reported significant slowdowns (20% or more) when using legacy cuFFT callbacks in CUDA 11.8 and newer (e.g., 12.2, 12.4, 12.9+) compared to CUDA 11.7. This often manifests as increased time spent in `cuMemFree_v2` during `cufftExecC2R` or `cufftExecR2C` calls.","severity":"gotcha","affected_versions":"CUDA 11.8+"},{"fix":"This issue was not observed in CUDA 12.0 and later. If using CUDA 11.8, ensure your CUDA context management is consistent or consider upgrading to a newer CUDA Toolkit version. Replacing `-cudalib=cufft` with `-lcufft` during compilation was also noted as a workaround.","message":"Memory leak with `nvc++ -cudalib=cufft`: A potential memory leak was reported in cuFFT v10.9.0.58 (shipped with CUDA 11.8) when building with `nvc++` and the `-cudalib=cufft` flag. 
This was linked to cuFFT failing to deallocate internal structures if the active CUDA context at program finalization was not the same used for plan creation.","severity":"gotcha","affected_versions":"CUDA 11.8 (cuFFT v10.9.0.58)"},{"fix":"Avoid using `cudaDeviceReset()` in critical paths before cuFFT plan creation. If absolutely necessary, re-establish the CUDA device context (e.g., with `cudaSetDevice(0)`) after `cudaDeviceReset()`.","message":"Interference of `cudaDeviceReset()` with `cufftPlanMany`: Calling `cudaDeviceReset()` before `cufftPlanMany` can lead to `CUFFT_INTERNAL_ERROR`. While adding `cudaSetDevice(0)` after the reset might mitigate it, `cudaDeviceReset()` is generally not recommended for regular use.","severity":"gotcha","affected_versions":"All versions (observed in CUDA 12.2, 12.4)"},{"fix":"Monitor GPU memory usage for large multi-GPU FFTs. If the error persists, consider reducing data size or reporting as a bug with NVIDIA, as 'internal error' provides limited actionable information.","message":"`CUFFT_INTERNAL_ERROR` in `cufftXtSetGPU` for multi-GPU FFTs: When performing large multi-GPU FFTs, `cufftXtSetGPU` can return an opaque 'internal error,' potentially indicating an out-of-memory condition or an unspecified library issue.","severity":"gotcha","affected_versions":"All versions (observed in CUDA 8.0 with large data)"},{"fix":"Try setting the environment variable `UV_CONCURRENT_DOWNLOADS=1` (for `uv` users) or similar mechanisms to limit concurrent downloads when installing from `pypi.nvidia.com`.","message":"Installation timeouts/failures with concurrent downloads from `pypi.nvidia.com`: Users attempting to install `nvidia-cufft-cu12` (and other NVIDIA PyPI packages) with tools that use concurrent downloads (e.g., `uv`) may experience failures due to timeout or network issues with `pypi.nvidia.com`.","severity":"gotcha","affected_versions":"All versions (related to `pip`/`uv` behavior)"},{"fix":"Install `nvidia-pyindex` first using `pip 
install nvidia-pyindex`, then install the desired package. Alternatively, configure pip to use the NVIDIA Python Package Index directly by adding `--extra-index-url https://pypi.nvidia.com` to your pip command or by configuring your pip.conf/pip.ini.","message":"Some NVIDIA Python packages are published on PyPI.org only as placeholders; installing one of them fails with a `RuntimeError` stating that the package is a placeholder. Those packages are hosted on the NVIDIA Python Package Index and require a specific installation method.","severity":"breaking","affected_versions":"All versions (related to NVIDIA PyPI package installation)"}],"env_vars":null,"last_verified":"2026-05-12T17:35:55.228Z","next_check":"2026-06-26T00:00:00.000Z","problems":[{"fix":"Ensure the NVIDIA CUDA Toolkit is correctly installed and that the directory containing `libcufft.so.X` (e.g., `/usr/local/cuda/lib64` or `C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\vX.X\\bin`) is included in your `LD_LIBRARY_PATH` (Linux) or `PATH` (Windows) environment variable. Also, verify that the CUDA version expected by the application matches your installed CUDA Toolkit. On Linux, you can instead register the directory system-wide under `/etc/ld.so.conf.d/` and run `sudo ldconfig`; `ldconfig` does not read `LD_LIBRARY_PATH`. For Python applications, `os.add_dll_directory()` can be used on Windows.","cause":"This error occurs when a program tries to load the cuFFT library at runtime but cannot find the shared library file (libcufft.so.X on Linux or cufft64_X.dll on Windows) because its directory is not in the system's library search path, or an incompatible CUDA Toolkit version is installed.","error":"OSError: libcufft.so.X: cannot open shared object file: No such file or directory"},{"fix":"Add the CUDA Toolkit's include directory to your compiler's include paths. For `nvcc`, you can use the `-I` flag (e.g., `-I/usr/local/cuda/include`); note that `-cudalib=cufft` is an `nvc++` (NVIDIA HPC SDK) option, not an `nvcc` one. 
If using a different compiler (like `g++` or `icpc`), explicitly add `-I<CUDA_HOME>/include` to your compilation flags, where `<CUDA_HOME>` is your CUDA installation directory.","cause":"This compilation error indicates that the C/C++ compiler cannot locate the `cufft.h` header file, which is necessary for projects that directly use the cuFFT library's API. This usually happens when the CUDA Toolkit's include directory is not correctly specified in the compiler's search paths.","error":"fatal error: cufft.h: No such file or directory"},{"fix":"Identify the exact `nvidia-cufft-cu12` version required by your main deep learning framework (e.g., PyTorch or TensorFlow) and ensure you install that specific version, or a compatible one. Often, installing the framework with its recommended CUDA support (e.g., `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` or `pip install tensorflow[and-cuda]`) will automatically manage these dependencies. If conflicts persist, consider creating a clean virtual environment and installing all required packages together.","cause":"This error occurs when installing Python packages (e.g., PyTorch, TensorFlow) that have specific version requirements for NVIDIA CUDA runtime libraries, including `nvidia-cufft-cu12`. The `pip` dependency resolver detects a conflict between an already installed `nvidia-cufft-cu12` version and the version required by the package being installed.","error":"ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. ... requires nvidia-cufft-cu12==X.Y.Z which is incompatible."},{"fix":"Troubleshoot by checking GPU memory availability before cuFFT operations. Ensure a valid CUDA context is established and not prematurely destroyed (e.g., avoid `cudaDeviceReset()` in critical sections without re-initializing the context). 
If the problem persists, try updating your NVIDIA drivers and CUDA Toolkit to the latest compatible versions, or simplify your cuFFT calls to isolate the problematic operation.","cause":"This is a generic error from the cuFFT library indicating an internal driver or library issue. It can stem from various problems, including insufficient GPU memory, an invalid CUDA context, or an unexpected state during cuFFT plan creation or execution.","error":"CUFFT_INTERNAL_ERROR"}],"ecosystem":"pypi","meta_description":null,"install_score":0,"install_tag":"stale","quickstart_score":0,"quickstart_tag":"stale","pypi_latest":"11.4.1.4","install_checks":{"last_tested":"2026-05-12","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 
","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":6.6,"import_time_s":null,"mem_mb":null,"disk_size":"390M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"cu12","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":56.4,"import_time_s":null,"mem_mb":null,"disk_size":"3.4G"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 
","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":6.2,"import_time_s":null,"mem_mb":null,"disk_size":"392M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"cu12","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":54.7,"import_time_s":null,"mem_mb":null,"disk_size":"3.4G"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 
","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":5.8,"import_time_s":null,"mem_mb":null,"disk_size":"384M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"cu12","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":49,"import_time_s":null,"mem_mb":null,"disk_size":"3.4G"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 
","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":5.7,"import_time_s":null,"mem_mb":null,"disk_size":"383M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"cu12","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":46.2,"import_time_s":null,"mem_mb":null,"disk_size":"3.4G"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 
","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":7,"import_time_s":null,"mem_mb":null,"disk_size":"390M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"cu12","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":53,"import_time_s":null,"mem_mb":null,"disk_size":"2.9G"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"cu12","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":-1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":-1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":-1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":-1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":-1}]}}