Dask Distributed

2026.3.0 · active · verified Thu Apr 09

Distributed is the scheduler for Dask, providing a distributed computation engine for parallel and out-of-core analytics. It manages computations across a cluster of machines, featuring dynamic task scheduling, fault tolerance, and diagnostics. The library is actively maintained with frequent releases, typically on a monthly or bi-monthly schedule, following a `YYYY.MM.patch` versioning scheme. The current version is 2026.3.0.

Warnings

Install
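This section is empty in the snapshot; Distributed is published on PyPI and conda-forge, so a typical install looks like the following (one of the two commands is enough):

```shell
# Install from PyPI; pulls in dask as a dependency
python -m pip install distributed

# Or from conda-forge
conda install -c conda-forge distributed
```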

Imports
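This section is also empty in the snapshot; the names used in the quickstart below are exported from the top-level `distributed` package (and re-exported as `dask.distributed`):

```python
# Core entry points used in the quickstart
from distributed import Client, LocalCluster

# The same names are also available as:
#   from dask.distributed import Client, LocalCluster
```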

Quickstart

This quickstart starts a local Dask cluster with `LocalCluster`, connects a `Client` to it, submits a simple `inc` function across ten inputs with `client.map`, and gathers the results. The `with` blocks ensure the client disconnects and the cluster's scheduler and workers shut down cleanly even if an error occurs. The dashboard link is only usable if `bokeh` is installed.

from distributed import Client, LocalCluster
import time

def inc(x):
    time.sleep(0.1) # Simulate work
    return x + 1

# Start a local Dask cluster
# Using 'with' statement ensures proper cleanup
with LocalCluster(n_workers=2, threads_per_worker=2, dashboard_address=':8787') as cluster:
    print(f"Dashboard available at: {cluster.dashboard_link}")

    # Connect a client to the cluster
    with Client(cluster) as client:
        print(f"Client connected to: {client.scheduler.address}")

        # Submit tasks
        futures = client.map(inc, range(10))

        # Gather results
        results = client.gather(futures)

        print(f"Results: {results}")
        assert results == [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    print("Client disconnected.")
print("LocalCluster closed.")
