{"id":7119,"library":"cupy-cuda13x","title":"CuPy (CUDA 13.x)","description":"CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python, designed as a drop-in replacement for existing NumPy/SciPy code on NVIDIA CUDA platforms. It uses CUDA Toolkit libraries such as cuBLAS and cuFFT to accelerate numerical computations on GPUs. The current version is 14.0.1. Major releases are infrequent (v14 was the first in two years), while minor and revision updates are more common.","status":"active","version":"14.0.1","language":"en","source_language":"en","source_url":"https://github.com/cupy/cupy","tags":["GPU","NumPy","SciPy","CUDA","array","scientific-computing"],"install":[{"cmd":"pip install cupy-cuda13x","lang":"bash","label":"Basic Installation (requires system CUDA Toolkit)"},{"cmd":"pip install 'cupy-cuda13x[ctk]'","lang":"bash","label":"Installation with CUDA Toolkit Python packages (driver only needed)"}],"dependencies":[{"reason":"Required hardware with Compute Capability 3.0 or larger for GPU acceleration.","package":"NVIDIA CUDA GPU","optional":false},{"reason":"Required for compiling and running CUDA kernels. This specific wheel targets CUDA 13.x. 
Can be avoided with `[ctk]` extra if using PyPI CUDA components.","package":"NVIDIA CUDA Toolkit 13.x","optional":true},{"reason":"CuPy is NumPy-compatible and relies on NumPy's API and structure.","package":"numpy","optional":false},{"reason":"Optional for SciPy-compatible functions via `cupyx.scipy`.","package":"scipy","optional":true},{"reason":"Required for `bfloat16` data type support introduced in CuPy v14.","package":"ml_dtypes","optional":true},{"reason":"Optional for additional cuTENSOR library features.","package":"cutensor-cu13","optional":true},{"reason":"Optional for additional NCCL library features (multi-GPU/multi-node collective operations).","package":"nvidia-nccl-cu13","optional":true}],"imports":[{"note":"Standard convention for importing CuPy.","symbol":"cupy","correct":"import cupy as cp"},{"note":"SciPy-compatible functions are located under the `cupyx.scipy` submodule, not directly under `cupy.scipy`.","wrong":"import cupy.scipy","symbol":"cupyx.scipy","correct":"import cupyx.scipy as cpxs"}],"quickstart":{"code":"import cupy as cp\nimport numpy as np\n\n# Check if a GPU is available\nif cp.cuda.is_available():\n    print(f\"CuPy is available. 
Current device: {cp.cuda.Device().id}\")\n\n    # Create a CuPy array on the GPU\n    x_gpu = cp.arange(10, dtype=cp.float32).reshape(2, 5)\n    print(f\"GPU array:\\n{x_gpu}\")\n    print(f\"Type of GPU array: {type(x_gpu)}\")\n\n    # Perform a computation on the GPU\n    y_gpu = x_gpu * 2 + 1\n    print(f\"Result of computation on GPU:\\n{y_gpu}\")\n\n    # Transfer the result back to CPU NumPy array\n    y_cpu = cp.asnumpy(y_gpu)\n    print(f\"CPU array (from GPU):\\n{y_cpu}\")\n    print(f\"Type of CPU array: {type(y_cpu)}\")\n\n    # Demonstrate a simple NumPy-like operation\n    sum_gpu = x_gpu.sum(axis=1)\n    print(f\"Sum along axis 1 on GPU: {sum_gpu}\")\n    print(f\"Type of sum on GPU: {type(sum_gpu)}\")\n\n    # Ensure all GPU operations complete before proceeding (useful for timing)\n    cp.cuda.Stream.null.synchronize()\nelse:\n    print(\"No NVIDIA GPU found or CuPy is not properly installed for CUDA.\")\n    print(\"Falling back to NumPy for demonstration.\")\n    x_cpu = np.arange(10, dtype=np.float32).reshape(2, 5)\n    print(f\"CPU array:\\n{x_cpu}\")","lang":"python","description":"This quickstart demonstrates basic CuPy array creation, arithmetic operations on the GPU, transferring data between GPU and CPU, and performing a NumPy-like aggregation. It includes a check for GPU availability and the use of `cp.cuda.Stream.null.synchronize()` for explicit GPU synchronization, which is important for accurate performance measurement."},"warnings":[{"fix":"Review and adapt code for NumPy 2 compatibility. Refer to NumPy 2 and CuPy v14 release notes for detailed changes.","message":"CuPy v14 aligns its behavior with NumPy 2 semantics, which includes changes to type promotion rules and casting behavior. 
Code relying on older NumPy 1.x type promotion might behave differently.","severity":"breaking","affected_versions":"14.x.x and above"},{"fix":"Migrate any cuDNN-dependent code to use `cuDNN Frontend` directly or other libraries that wrap cuDNN functionality.","message":"CuPy v14 has completely removed all cuDNN-related functionality. Direct usage of `cupy.cuda.cudnn` will fail.","severity":"breaking","affected_versions":"14.x.x and above"},{"fix":"Upgrade to CUDA Toolkit 12.x or 13.x and Python 3.10 or newer.","message":"Support for CUDA 11 and Python 3.9 has been dropped in CuPy v14. Users on these older environments must upgrade.","severity":"breaking","affected_versions":"14.x.x and above"},{"fix":"Install the `cupy-cudaXXx` wheel that matches the major version of your CUDA Toolkit, and make sure the NVIDIA driver supports that CUDA version. For easier setup without a full system CUDA Toolkit, use `pip install 'cupy-cuda13x[ctk]'` to install PyPI-distributed CUDA components.","message":"`cupy-cuda13x` requires a compatible NVIDIA driver and either a system CUDA Toolkit 13.x or the `[ctk]` extra. A CUDA version mismatch between the installed CuPy wheel and the system's CUDA Toolkit can lead to `ImportError` or runtime compilation errors.","severity":"gotcha","affected_versions":"All versions tied to specific CUDA major versions (e.g., cupy-cuda13x)"},{"fix":"This is expected behavior and is rarely a problem for repeated operations. When benchmarking, perform a warm-up call outside the timed section.","message":"The first call to a CuPy function can be slower than subsequent calls because CUDA kernels are compiled just in time and then cached.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Call `cp.cuda.Stream.null.synchronize()` after the GPU computation and before measuring time or accessing results on the CPU.","message":"GPU operations in CuPy are asynchronous by default. 
For accurate timing of GPU execution in benchmarks or to ensure operations complete before host interaction, explicit synchronization is necessary.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure you are in the correct virtual environment. If CuPy was just installed, restart your Python script, IDE, or terminal to refresh environment variables. Verify installation with `pip freeze | grep cupy`.","cause":"CuPy was either not installed in the active Python environment, or environment variables (like PATH) were not reloaded after installation, particularly when installing a CUDA Toolkit.","error":"ModuleNotFoundError: No module named 'cupy'"},{"fix":"Convert the NumPy array to a CuPy array using `cp.asarray()` or `cp.array()` before passing it to CuPy functions. Example: `gpu_array = cp.asarray(numpy_array)`.","cause":"Attempting to pass a NumPy array (CPU-resident) directly to a CuPy function that expects a CuPy array (GPU-resident).","error":"TypeError: Argument 'x' has incorrect type (expected cupy.core.core.ndarray, got numpy.ndarray)"},{"fix":"Verify your CUDA Toolkit installation. Ensure `CUDA_PATH` or `LD_LIBRARY_PATH` are set if CUDA is in a non-standard location. If using PyPI `[ctk]` installation, make sure the `nvidia-cuda-runtime-cuXX` package is correctly installed to provide headers. 
You might need to explicitly install `cuda-cudart-dev-13-X` for CUDA 13 (or the equivalent `cuda-cudart-dev-12-X` for CUDA 12).","cause":"CuPy's CUDA compiler (NVRTC) cannot find the necessary CUDA header files, often because the CUDA Toolkit installation is incorrect or incomplete, or because an environment variable (`CUDA_PATH`, `LD_LIBRARY_PATH`) is not set correctly.","error":"cupy.cuda.compiler.CompileException: nvrtc: error: failed to load builtins; catastrophic error: cannot open source file \"cuda_fp16.h\""},{"fix":"Explicitly convert the CuPy array to a NumPy array using `cupy.asnumpy()` or the `.get()` method. For example: `cpu_array = gpu_array.get()` or `cpu_array = cp.asnumpy(gpu_array)`.","cause":"Attempting to implicitly convert a CuPy array to a NumPy array, for example by passing it to `np.asarray()` or to a NumPy-only function. CuPy requires the device-to-host copy to be explicit.","error":"TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly."}]}