Dask Kubernetes
dask-kubernetes provides native integration between Dask and Kubernetes, letting users deploy and manage Dask clusters either programmatically through the Python API (`KubeCluster`) or declaratively through Kubernetes Custom Resources (the Dask Operator). The current version is 2026.3.0; releases follow a rapid cadence, often monthly or quarterly, in step with the broader Dask ecosystem.
Common errors
- `kubernetes.client.rest.ApiException: (403) Reason: Forbidden`
  Cause: The Kubernetes service account used by dask-kubernetes (or the user's `kubectl` context) lacks the RBAC permissions needed to create or manage resources in the target namespace.
  Fix: Verify that your `kubectl` context is configured correctly and has the required permissions. If running `KubeCluster` inside a Kubernetes pod, ensure the pod's `serviceAccountName` has `RoleBindings` granting access to pods, services, and deployments in its namespace. Consult the Dask Kubernetes RBAC documentation for the minimum required permissions.
- `ValueError: No Kubernetes context found. Either provide a config file, or run within a Kubernetes Pod.`
  Cause: dask-kubernetes could not find a valid kubeconfig file and is not running inside a Kubernetes cluster.
  Fix: Ensure your `~/.kube/config` file is set up and points to a running Kubernetes cluster. If running locally, confirm `kubectl config current-context` returns a valid context. If deploying within a Kubernetes pod, ensure the pod can read the cluster's service account token.
- `Waiting for Dask scheduler pod to be ready...` followed by a `TimeoutError` or similar pod-failure message
  Cause: The Dask scheduler pod failed to start or become ready. Common reasons include an incorrect image name/tag, insufficient resource requests/limits, network policy restrictions, or a misconfigured `DaskCluster` custom resource.
  Fix: Check `kubectl describe pod <scheduler-pod-name>` and `kubectl logs <scheduler-pod-name>` for error messages. Verify the Docker image path and tag are correct and accessible. Review the resource requests/limits defined in your `KubeCluster` arguments or `DaskCluster` YAML, and ensure no network policies block communication to or from the scheduler.
- `ModuleNotFoundError: No module named 'dask_kubernetes'`
  Cause: The `dask-kubernetes` library is not installed in the current Python environment.
  Fix: Run `pip install dask-kubernetes` in your active Python environment; if you use virtual environments, make sure the correct one is activated.
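For the `No Kubernetes context found` error above, it helps to check which configuration source (if any) applies: pods get in-cluster environment variables injected by the kubelet, while local sessions rely on a kubeconfig file. The stdlib-only helper below is an illustrative sketch of that detection logic; the function name and return values are invented for this example and are not part of the dask-kubernetes API.

```python
import os
from pathlib import Path

def find_kubernetes_context(env=None, kubeconfig_path=None):
    """Roughly mimic how a Kubernetes client decides where its config
    comes from. Returns "in-cluster", "kubeconfig", or None.
    Illustrative sketch only, not dask-kubernetes's actual code."""
    env = os.environ if env is None else env
    kubeconfig_path = Path(
        kubeconfig_path
        or env.get("KUBECONFIG", "")
        or Path.home() / ".kube" / "config"
    )
    # The kubelet injects these variables into every pod.
    if "KUBERNETES_SERVICE_HOST" in env and "KUBERNETES_SERVICE_PORT" in env:
        return "in-cluster"
    # Otherwise fall back to a kubeconfig file, if one exists.
    if kubeconfig_path.is_file():
        return "kubeconfig"
    return None
```

If this returns `None`, `KubeCluster` has nothing to connect to, which is exactly the condition the `ValueError` reports.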
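The `describe`/`logs` steps suggested for a stuck scheduler pod can be scripted. The helper below only assembles the `kubectl` invocations as argument lists (suitable for `subprocess.run`); the function name is illustrative, not a dask-kubernetes API.

```python
def scheduler_debug_commands(pod_name: str, namespace: str = "default"):
    """Return the kubectl invocations commonly used to diagnose a
    scheduler pod that never becomes ready. Sketch only: run each
    entry with subprocess.run(cmd) against your own cluster."""
    return [
        # Shows scheduling failures, image-pull errors, unmet resource requests.
        ["kubectl", "describe", "pod", pod_name, "-n", namespace],
        # Shows errors from the scheduler process itself.
        ["kubectl", "logs", pod_name, "-n", namespace],
        # Shows cluster events tied to this pod (evictions, probe failures).
        ["kubectl", "get", "events", "-n", namespace,
         "--field-selector", f"involvedObject.name={pod_name}"],
    ]
```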
Warnings
- breaking Support for Python 3.9 was dropped in version 2024.8.0. Users on Python 3.9 must upgrade their Python environment.
- gotcha dask-kubernetes offers two main deployment strategies: `KubeCluster` (programmatic, client-side) and the Dask Kubernetes Operator (declarative, CRD-based via YAML). Users often confuse the two or apply configuration meant for one to the other.
- breaking The minimum required version for `kopf` (a core dependency for the operator) was bumped to `1.38.0` in `dask-kubernetes==2025.7.0`.
- breaking The minimum required version for `kr8s` (a core dependency for Kubernetes interaction) was bumped to `0.20.*` in `dask-kubernetes==2025.4.0`.
- gotcha Common errors arise from insufficient Kubernetes RBAC (Role-Based Access Control) permissions for the service account used by Dask pods. This can prevent pod creation, service exposure, or resource scaling.
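To address the RBAC gotcha above, the service account driving Dask needs read/write access to a handful of resources in its namespace. The sketch below builds a namespaced `Role` manifest as a plain dict; the resource and verb lists are assumptions based on typical `KubeCluster` usage, not the official minimal set, so check the Dask Kubernetes RBAC documentation before applying something like this.

```python
def dask_role_manifest(namespace: str = "default") -> dict:
    """Build a namespaced RBAC Role granting permissions a Dask cluster
    typically needs. Illustrative sketch; consult the Dask Kubernetes
    docs for the authoritative minimal rule set."""
    verbs = ["get", "list", "watch", "create", "delete"]
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "dask-cluster-role", "namespace": namespace},
        "rules": [
            # Core resources KubeCluster creates and inspects.
            {"apiGroups": [""],
             "resources": ["pods", "pods/log", "services"],
             "verbs": verbs},
            # Custom resources managed by the Dask Operator.
            {"apiGroups": ["kubernetes.dask.org"],
             "resources": ["daskclusters", "daskworkergroups"],
             "verbs": verbs},
        ],
    }
```

Bind the Role to the pod's service account with a matching `RoleBinding`; a missing binding is the usual cause of the 403 `Forbidden` error listed under Common errors.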
Install
-
pip install dask-kubernetes
Imports
- KubeCluster
from dask_kubernetes import KubeCluster
- DaskKubernetesOperator
from dask_kubernetes.operator import DaskKubernetesOperator
Quickstart
from dask_kubernetes import KubeCluster
from dask.distributed import Client
# Ensure you have a kubectl context configured for a running Kubernetes cluster.
# KubeCluster automatically detects the current context.
# Create a Dask cluster on Kubernetes
cluster = KubeCluster(name="my-dask-cluster", n_workers=3)
print(f"Dashboard link: {cluster.dashboard_link}")
# Connect a Dask client to the cluster
client = Client(cluster)
# Perform a simple computation
def inc(x): return x + 1
futures = client.map(inc, range(10))
# Aggregate the results; Dask resolves the futures inside the list
total = client.submit(sum, futures)
print(f"Result of computation: {total.result()}")  # 55
# Scale the cluster (optional); scale() returns before the pods are ready
cluster.scale(5)
client.wait_for_workers(5)
print(f"Cluster scaled to {len(client.scheduler_info()['workers'])} workers.")
# Close the client and cluster when done
client.close()
cluster.close()
print("Dask cluster and client closed.")
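The same cluster can also be declared through the operator's `DaskCluster` custom resource instead of `KubeCluster`. The sketch below assembles a minimal manifest as a plain dict; the field layout follows the `kubernetes.dask.org/v1` schema as documented for the Dask Operator, but treat it as illustrative and verify against your installed operator version (`dask_kubernetes.operator.make_cluster_spec` generates a spec like this for you).

```python
def dask_cluster_manifest(name: str, n_workers: int = 3,
                          image: str = "ghcr.io/dask/dask:latest") -> dict:
    """Build a minimal DaskCluster custom-resource manifest.
    Illustrative sketch of the kubernetes.dask.org/v1 shape; check the
    operator docs for the authoritative schema."""
    scheduler_container = {
        "name": "scheduler",
        "image": image,
        "args": ["dask-scheduler"],
        "ports": [
            {"name": "tcp-comm", "containerPort": 8786},
            {"name": "http-dashboard", "containerPort": 8787},
        ],
    }
    worker_container = {
        "name": "worker",
        "image": image,
        # The operator injects the worker name and scheduler address.
        "args": ["dask-worker", "--name", "$(DASK_WORKER_NAME)"],
    }
    return {
        "apiVersion": "kubernetes.dask.org/v1",
        "kind": "DaskCluster",
        "metadata": {"name": name},
        "spec": {
            "worker": {
                "replicas": n_workers,
                "spec": {"containers": [worker_container]},
            },
            "scheduler": {
                "spec": {"containers": [scheduler_container]},
                "service": {
                    "type": "ClusterIP",
                    "selector": {"dask.org/cluster-name": name,
                                 "dask.org/component": "scheduler"},
                    "ports": [{"name": "tcp-comm", "port": 8786,
                               "targetPort": "tcp-comm"}],
                },
            },
        },
    }
```

Serialize the dict to YAML and `kubectl apply -f` it (with the operator installed in the cluster), and the operator creates the scheduler and worker pods declaratively rather than from your Python session.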