dask-cudf-cu12
26.4.0 verified Mon Apr 27 auth: no python
Utilities for integrating Dask with cuDF on CUDA 12.x. The package provides distributed DataFrame functionality built on cuDF's GPU-accelerated columnar operations. Version 26.4.0 requires Python >=3.11 and is part of the RAPIDS 26.04 release. Releases follow the RAPIDS release schedule.
pip install dask-cudf-cu12
Common errors
error ModuleNotFoundError: No module named 'dask_cudf'
cause Package not installed or wrong variant installed.
fix
pip install dask-cudf-cu12 (for CUDA 12.x) or pip install dask-cudf (for CUDA 11.x).
error AttributeError: module 'dask_cudf' has no attribute 'from_cudf'
cause Old or mismatched version of dask-cudf/cudf.
fix
Upgrade packages: pip install --upgrade dask-cudf-cu12 cudf-cu12
Warnings
breaking dask-cudf-cu12 is CUDA 12.x only. Use dask-cudf for CUDA 11.x or older. Installing the wrong variant for your CUDA version will cause import errors.
fix Check your CUDA version with nvidia-smi. Install dask-cudf-cu12 if CUDA >=12.0, otherwise dask-cudf.
deprecated DataFrame.apply_chunks and Groupby.apply_grouped were removed in v25.12.00. Use map_partitions or groupby.apply instead.
fix Replace df.apply_chunks(func, ...) with df.map_partitions(func). For grouped operations, use groupby_obj.apply(func, meta=...).
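As an illustration of the map_partitions replacement, here is a minimal sketch. The column names a, b, and total and the functions add_total and run_example are made up for this example. The per-partition function receives one partition at a time, and the GPU imports are placed inside run_example so the module can be loaded on a machine without a GPU.

```python
def add_total(part):
    """Runs once per partition and returns a new frame with an extra column."""
    return part.assign(total=part["a"] + part["b"])


def run_example():
    """Build a small dask_cudf DataFrame and apply add_total to each partition."""
    # Imported here so that importing this module does not require a GPU.
    import cudf
    import dask_cudf

    gdf = cudf.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})
    ddf = dask_cudf.from_cudf(gdf, npartitions=2)
    # map_partitions is lazy; compute() triggers execution.
    return ddf.map_partitions(add_total).compute()
```

Calling run_example() requires a working cuDF installation. The grouped case with groupby_obj.apply(func, meta=...) is not shown here.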
gotcha A conda environment that contains both dask-cudf-cu12 and dask-cudf leads to import confusion, and pip can mix the packages in the same way. Install only one variant.
fix Use separate conda environments for CUDA 11.x and 12.x, or pip install only the correct variant.
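To see whether an environment already contains more than one variant, a short check like the one below may help. It is only a sketch: KNOWN_VARIANTS lists the two names mentioned on this page, the function names are hypothetical, and importlib.metadata only sees packages that ship Python distribution metadata.

```python
from importlib import metadata

# Variant names taken from this page; extend the tuple if you use others.
KNOWN_VARIANTS = ("dask-cudf", "dask-cudf-cu12")


def installed_variants(installed_names):
    """Return the known variants present in an iterable of distribution names."""
    normalized = {name.lower().replace("_", "-") for name in installed_names}
    return sorted(v for v in KNOWN_VARIANTS if v in normalized)


def current_environment_names():
    """List distribution names visible to the running interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            names.append(name)
    return names
```

If installed_variants(current_environment_names()) returns more than one entry, the environment mixes variants.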
gotcha dask_cudf.from_cudf() does not automatically repartition data. If the resulting collection has too few partitions, Dask may underutilize the GPUs.
fix Pass npartitions explicitly, for example dask_cudf.from_cudf(df, npartitions=len(gpu_devices)), where gpu_devices stands for your own list of available GPUs.
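One possible way to derive a partition count is sketched below. It assumes the GPUs to use are listed in the CUDA_VISIBLE_DEVICES environment variable. The helpers visible_gpu_count, choose_npartitions, and to_dask_cudf are hypothetical, and one partition per GPU is only a starting point, not a tuned value.

```python
import os


def visible_gpu_count(env=None):
    """Count devices listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return len([d for d in value.split(",") if d.strip()])


def choose_npartitions(gpu_count, per_gpu=1):
    """Pick at least one partition, scaling with the GPU count when it is known."""
    return max(1, (gpu_count or 1) * per_gpu)


def to_dask_cudf(gdf, env=None):
    """Wrap a cuDF DataFrame with an explicit npartitions value."""
    # Imported here so that importing this module does not require a GPU.
    import dask_cudf

    n = choose_npartitions(visible_gpu_count(env))
    return dask_cudf.from_cudf(gdf, npartitions=n)
```

When the variable is unset, the sketch falls back to a single partition, so you may prefer to pass a count you obtained another way.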
Install
conda install -c rapidsai -c conda-forge dask-cudf-cu12
Imports
- dask_cudf
import dask_cudf
- DataFrame
wrong from cudf import DataFrame
correct from dask_cudf import DataFrame
Quickstart
import dask_cudf
import cudf
# Create a cuDF DataFrame, then split it into a two-partition dask_cudf DataFrame
df_cudf = cudf.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
ddf = dask_cudf.from_cudf(df_cudf, npartitions=2)
print(ddf.compute())