{"id":8062,"library":"dask-cuda","title":"Dask-CUDA","description":"Dask-CUDA is a Python library providing utilities to facilitate interactions between Dask and NVIDIA CUDA-enabled GPUs. It extends `dask.distributed`'s `LocalCluster` and `Worker` to manage and deploy Dask workers efficiently on GPU systems. Key features include automatic instantiation of per-GPU workers, setting CPU affinity for optimal performance, and robust GPU memory management, including spilling to host memory. It is a core component of the RAPIDS suite for GPU-accelerated data science. The library maintains an active development status with regular releases, currently at version 26.4.0.","status":"active","version":"26.4.0","language":"en","source_language":"en","source_url":"https://github.com/rapidsai/dask-cuda","tags":["dask","cuda","gpu","distributed-computing","rapids","data-science","python"],"install":[{"cmd":"pip install dask-cuda","lang":"bash","label":"PyPI (latest CUDA)"},{"cmd":"pip install 'dask-cuda[cu12]' # for CUDA 12\npip install 'dask-cuda[cu13]' # for CUDA 13","lang":"bash","label":"PyPI (specific CUDA version)"},{"cmd":"conda install -c rapidsai -c conda-forge dask-cuda cuda-version=13.1","lang":"bash","label":"Conda (example for CUDA 13.1)"}],"dependencies":[{"reason":"Core distributed computing library that dask-cuda extends.","package":"dask"},{"reason":"Dask's cluster management library, essential for dask-cuda's functionality.","package":"distributed"},{"reason":"GPU DataFrame library, frequently used with dask-cuda for data processing (optional, for dask-cudf workflows).","package":"cudf","optional":true},{"reason":"GPU array library, often used with dask-cuda for array computations (optional, for dask-array GPU workflows).","package":"cupy","optional":true},{"reason":"Runtime dependency for CUDA JIT compilation utilities. The base 'numba' package was removed as a direct dependency in v26.04.00, but numba-cuda remains.","package":"numba-cuda","optional":true},{"reason":"RAPIDS Memory Manager, highly recommended for efficient GPU memory pooling and spilling.","package":"rmm","optional":true},{"reason":"Provides UCX communication support, replacing legacy UCX-Py integration (required for UCX-accelerated communication).","package":"distributed-ucxx","optional":true}],"imports":[{"symbol":"LocalCUDACluster","correct":"from dask_cuda import LocalCUDACluster"},{"symbol":"Client","correct":"from dask.distributed import Client"}],"quickstart":{"code":"import os\nfrom dask_cuda import LocalCUDACluster\nfrom dask.distributed import Client\n\nif __name__ == \"__main__\":\n    # The if __name__ == \"__main__\": guard is required in standalone scripts\n    # because workers are spawned as subprocesses.\n    # Configure for 2 GPUs, a 90% RMM pool, and cuDF spilling\n    cluster = LocalCUDACluster(\n        CUDA_VISIBLE_DEVICES=\"0,1\",  # Example: use devices 0 and 1\n        rmm_pool_size=0.9,           # Use 90% of GPU memory as a pool\n        enable_cudf_spill=True,      # Enable spilling to host memory if needed\n        local_directory=os.environ.get('DASK_LOCAL_DIRECTORY', '/tmp/dask-cuda')\n    )\n    client = Client(cluster)\n\n    print(f\"Dask-CUDA cluster dashboard link: {client.dashboard_link}\")\n    # Your Dask-accelerated GPU computations go here\n    # For example, with dask-cudf:\n    # import dask\n    # import dask.dataframe as dd\n    # import cudf\n    # dask.config.set({\"dataframe.backend\": \"cudf\"})\n    # ddf = dd.read_csv(\"my_gpu_data.csv\")\n    # result = ddf.groupby(\"col\").sum().compute()\n\n    client.close()\n    cluster.close()","lang":"python","description":"This quickstart demonstrates how to set up a `LocalCUDACluster` and connect a `dask.distributed.Client`. It highlights common configurations like specifying visible GPUs, configuring RAPIDS Memory Manager (RMM) for memory pooling, and enabling cuDF spilling to prevent out-of-memory errors on large datasets. The use of an `if __name__ == \"__main__\":` block is crucial for standalone scripts."},"warnings":[{"fix":"Remove `numba` from your environment if it was installed as a direct `dask-cuda` dependency. Ensure `numba-cuda` is installed if you rely on CUDA JIT features.","message":"The `numba` package is no longer a direct dependency as of v26.04.00. While `numba-cuda` remains relevant for CUDA JIT compilation, direct usage or reliance on the base `numba` package within `dask-cuda` might lead to issues.","severity":"breaking","affected_versions":">=26.04.00"},{"fix":"Install `distributed-ucxx` (e.g., `pip install distributed-ucxx` or `conda install distributed-ucxx -c conda-forge`) and update Dask configuration keys from `distributed.comm.ucx.*` to `distributed-ucxx.*` if previously configured manually.","message":"Support for the `UCX-Py` library has been removed in favor of `distributed-ucxx` starting from v25.10.00. Direct use of `protocol='ucx'` may fail without the new `distributed-ucxx` package.","severity":"breaking","affected_versions":">=25.10.00"},{"fix":"Upgrade your NVIDIA CUDA Toolkit to a supported version (e.g., CUDA 12.x or 13.x) or use an older `dask-cuda` version that supports CUDA 11.","message":"CUDA 11 support was removed from dependencies starting with v25.08.00. Users on CUDA 11 might experience compatibility issues or build failures.","severity":"breaking","affected_versions":">=25.08.00"},{"fix":"Adopt the recommended Dask DataFrame API for GPU acceleration by setting `dask.config.set({\"dataframe.backend\": \"cudf\"})` after importing `dask` and ensuring `cudf` is installed.","message":"Legacy `Dask-cuDF` handling was removed in v25.02.00. Older patterns for integrating `dask-cudf` might no longer work as expected.","severity":"breaking","affected_versions":">=25.02.00"},{"fix":"Always wrap your `LocalCUDACluster` and `Client` creation in an `if __name__ == \"__main__\":` block when running as a script.","message":"When using `LocalCUDACluster` in a standalone Python script, it is crucial to enclose the cluster and client initialization within an `if __name__ == \"__main__\":` block. Failure to do so can lead to unexpected behavior, deadlocks, or errors related to subprocess spawning.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install `distributed-ucxx` (e.g., `pip install distributed-ucxx`) and ensure your Dask configuration is updated to use the new package.","cause":"The `ucx` protocol for Dask `distributed` requires the `distributed-ucxx` package, and `UCX-Py` support was deprecated.","error":"KeyError: 'protocol ucx does not exist'"},{"fix":"Initialize `LocalCUDACluster` with `rmm_pool_size` (e.g., `rmm_pool_size=0.9`) and `enable_cudf_spill=True` to leverage RMM for memory pooling and spill to host memory if necessary.","cause":"GPU memory limits are exceeded during computation. This is common when not using RAPIDS Memory Manager (RMM) or when spilling is not enabled/configured properly.","error":"CUDA_ERROR_OUT_OF_MEMORY: out of memory"},{"fix":"Ensure NVIDIA drivers are installed and functional. If running on a multi-GPU system, explicitly set `CUDA_VISIBLE_DEVICES` in your environment or pass it to `LocalCUDACluster` (e.g., `LocalCUDACluster(CUDA_VISIBLE_DEVICES='0,1')`).","cause":"Dask workers are not detecting available GPUs, possibly due to the `CUDA_VISIBLE_DEVICES` environment variable not being set correctly or NVIDIA driver/CUDA Toolkit issues.","error":"dask.distributed.worker - WARNING - No NVIDIA devices detected"},{"fix":"Set `dask.config.set({\"dataframe.backend\": \"cudf\"})` after importing `dask` to instruct Dask DataFrame to use cuDF as its backend, provided `cudf` is installed.","cause":"Attempting to use `dask-cudf` specific methods on a generic `dask.dataframe` without explicitly setting the backend to 'cudf'.","error":"AttributeError: 'DataFrame' object has no attribute 'cudf'"}]}