Dask Cloud Provider

2025.9.0 · active · verified Thu Apr 16

Dask Cloud Provider (dask-cloudprovider) is a Python library that provides native cloud integration for Dask. It offers classes for creating and managing ephemeral Dask clusters on AWS, GCP, Azure, DigitalOcean, Hetzner, IBM Cloud, OpenStack, and Nebius, along with plugins that make Dask components cloud-aware, with the aim of simplifying the deployment and operation of Dask clusters in the cloud. The latest version, 2025.9.0, was released in September 2025. The project is actively maintained, and releases are published automatically when tags are pushed to GitHub.

Quickstart

This quickstart creates an ephemeral Dask cluster on AWS Fargate, connects a client, runs a simple computation, and uses a context manager to ensure the cloud resources are de-provisioned afterwards. Cloud provider credentials must be configured beforehand (for AWS, e.g., by running `aws configure`); otherwise the cluster will fail to provision.
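
Because a missing credential only surfaces once provisioning has started, a small preflight check can save a failed launch. The following is a minimal sketch, not part of dask-cloudprovider: `find_aws_credential_source` is a hypothetical helper that looks only at the two most common sources (environment variables and the shared credentials file) and ignores IAM roles, SSO, and other providers that the AWS SDK can also use.

```python
import os
from pathlib import Path


def find_aws_credential_source(env=None, home=None):
    """Return a label for the first credential source found, or None.

    Hypothetical helper for illustration: it checks environment variables
    and the shared credentials file only, so a None result does not prove
    that credentials are unavailable (e.g. an IAM role may still apply).
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Static keys exported in the environment.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    # AWS_SHARED_CREDENTIALS_FILE overrides the default ~/.aws/credentials path.
    cred_path = env.get("AWS_SHARED_CREDENTIALS_FILE") or home / ".aws" / "credentials"
    if Path(cred_path).is_file():
        return "shared credentials file"

    return None


source = find_aws_credential_source()
print(source or "no credentials found in the environment or ~/.aws/credentials")
```

If the helper returns None, running `aws configure` or exporting the variables above is a reasonable first step before attempting the quickstart.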

import os
from dask_cloudprovider.aws import FargateCluster
from dask.distributed import Client

# Ensure AWS credentials are configured (e.g., via the AWS CLI or environment variables).
# For a real deployment, consider setting the region and other specifics
# through environment variables or a Dask config file.
# Avoid hardcoding secrets in source; the lines below only illustrate the variable names:
# os.environ['AWS_ACCESS_KEY_ID'] = 'YOUR_ACCESS_KEY'
# os.environ['AWS_SECRET_ACCESS_KEY'] = 'YOUR_SECRET_KEY'
# os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'

try:
    # Create a Dask cluster on AWS Fargate; this provisions (billable) cloud resources.
    # The context manager ensures the cluster is shut down automatically.
    # worker_cpu is in CPU units (1024 = 1 vCPU); worker_mem is in MB.
    with FargateCluster(n_workers=1, worker_cpu=1024, worker_mem=2048) as cluster:
        print(f"Dask Dashboard link: {cluster.dashboard_link}")
        
        # Connect a Dask client to the cluster; the inner context manager
        # closes the client even if the computation raises
        with Client(cluster) as client:
            print("Dask Client connected.")

            # Perform some Dask computation
            futures = client.map(lambda x: x * x, range(10))
            results = client.gather(futures)
            print(f"Computation results: {results}")
        print("Dask Client closed.")
    print("Dask Cluster resources automatically closed (due to context manager).")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure your AWS credentials are configured and that you have sufficient permissions.")
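
The quickstart requests 1024 CPU units and 2048 MB per worker. Fargate accepts only specific CPU and memory pairings, so an invalid pair is rejected at provisioning time. As a rough sketch, the pairing can be checked locally first. The table below covers only the classic task sizes up to 4 vCPU, and `is_valid_fargate_size` is a hypothetical helper, not part of dask-cloudprovider.

```python
# Fargate CPU units (1024 = 1 vCPU) mapped to the memory values (in MiB)
# accepted for that CPU size. Only the classic sizes up to 4 vCPU are
# listed; this is an illustrative table, not an exhaustive one.
FARGATE_MEMORY_BY_CPU = {
    256: (512, 1024, 2048),
    512: tuple(range(1024, 4096 + 1, 1024)),
    1024: tuple(range(2048, 8192 + 1, 1024)),
    2048: tuple(range(4096, 16384 + 1, 1024)),
    4096: tuple(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu, memory_mib):
    """Return True if (cpu, memory_mib) is one of the pairings in the table above."""
    return memory_mib in FARGATE_MEMORY_BY_CPU.get(cpu, ())


# The pairing used in the quickstart (1 vCPU, 2 GiB) is in the table.
print(is_valid_fargate_size(1024, 2048))
# 1 vCPU with only 1 GiB is not, because 1 vCPU requires at least 2 GiB.
print(is_valid_fargate_size(1024, 1024))
```

Running a check like this before constructing the cluster keeps a sizing mistake from costing a provisioning round trip.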
