ipyparallel
ipyparallel is a library for interactive parallel and distributed computing in the IPython ecosystem. A client connects to a cluster of Python kernels (engines) and uses them to execute code in parallel, process distributed data, and manage asynchronous tasks. The current version is 9.1.0. Releases follow a moderate cadence, and major versions introduce significant features or compatibility changes.
Warnings
- breaking: ipyparallel 9.x requires Python 3.10 or newer; older Python versions are not supported.
- gotcha: Firewalls or incorrect network configuration can block connections to the cluster, especially when engines run on different machines.
- gotcha: The Client must connect to the same profile the cluster was started with. If `ipcluster start` was run with `--profile=myprofile`, create the client with `ipp.Client(profile='myprofile')`.
- gotcha: `DirectView` (`client[:]`) and `LoadBalancedView` (`client.load_balanced_view()`) distribute work differently. A `DirectView` targets an explicit set of engines, while a `LoadBalancedView` lets the scheduler assign each task to an available engine. The choice affects both task distribution and performance.
- gotcha: Manage the cluster lifecycle explicitly (`ipcluster start` / `ipcluster stop`). A cluster that is never stopped leaves orphan processes that keep consuming resources.
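The profile and firewall gotchas above can be checked before creating a Client. The following is a minimal diagnostic sketch using only the standard library. It rests on assumptions not stated in this page: that the cluster was started with `ipcluster start` and the default cluster id (so the client connection file is named `ipcontroller-client.json` inside the profile's `security` directory), and that this JSON file contains `interface`, `location`, and `registration` keys. The helper names `client_connection_file` and `controller_reachable` are hypothetical.

```python
import json
import os
import socket
from pathlib import Path
from urllib.parse import urlparse


def client_connection_file(profile="default", ipython_dir=None):
    """Build the assumed path of the client connection file for a profile."""
    # Assumption: IPYTHONDIR overrides the default ~/.ipython location.
    base = Path(ipython_dir or os.environ.get("IPYTHONDIR") or Path.home() / ".ipython")
    return base / f"profile_{profile}" / "security" / "ipcontroller-client.json"


def controller_reachable(connection_file, timeout=2.0):
    """Return True if a TCP connection to the controller's registration port succeeds."""
    info = json.loads(Path(connection_file).read_text())
    # Assumption: "interface" looks like "tcp://127.0.0.1" and
    # "registration" holds the controller's registration port.
    host = urlparse(info["interface"]).hostname
    if host in (None, "*", "0.0.0.0"):
        # A wildcard bind address is not connectable; fall back to "location".
        host = info.get("location") or "127.0.0.1"
    try:
        with socket.create_connection((host, int(info["registration"])), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If `client_connection_file('myprofile')` does not exist, the cluster was probably started under a different profile. If the file exists but `controller_reachable` returns False, a firewall or bind-address problem is a likely cause.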
Install
pip install ipyparallel
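After installing, it can be useful to confirm that the interpreter meets the Python 3.10 requirement listed under Warnings and that the package is visible to it. This is a small sketch using only the standard library; the function name `check_environment` is hypothetical.

```python
import sys
from importlib import metadata


def check_environment(package="ipyparallel", min_python=(3, 10)):
    """Return (python_ok, installed_version), where the version is None if missing."""
    python_ok = sys.version_info >= min_python
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        version = None
    return python_ok, version


if __name__ == "__main__":
    ok, version = check_environment()
    print(f"Python version OK: {ok}; ipyparallel version: {version}")
```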
Imports
- Client
from IPython.parallel import Client  # wrong: legacy path; the code moved to the ipyparallel package in IPython 4.0
from ipyparallel import Client  # correct
- Cluster
from ipyparallel.cluster import Cluster
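The `Cluster` class can also start and stop engines from Python, which avoids the orphan-process problem of a forgotten `ipcluster stop`. The sketch below assumes a local machine that is allowed to launch engine processes; the function name `map_on_temporary_cluster` is hypothetical. It relies on `ipp.Cluster` working as a context manager that returns a connected Client.

```python
def map_on_temporary_cluster(func, items, n=2):
    """Start n local engines, map func over items, then stop the cluster."""
    # Imported here so this sketch can be loaded even where ipyparallel is missing.
    import ipyparallel as ipp

    # Entering the context starts a controller plus n engines and returns a
    # connected Client; leaving the context stops them again.
    with ipp.Cluster(n=n) as rc:
        # map_sync blocks, so the results are complete before the cluster stops.
        return rc[:].map_sync(func, items)
```

For example, `map_on_temporary_cluster(abs, [-3, 1, -2])` should return the absolute values in input order. Pass functions that are importable on the engines, such as builtins, or functions that do their own imports internally.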
Quickstart
import ipyparallel as ipp
import os

# NOTE: This code needs a running ipyparallel cluster.
# Start one from your terminal using: ipcluster start
# (e.g., 'ipcluster start -n 4' for 4 engines)
try:
    # Connect to the default cluster profile
    client = ipp.Client()
    if not client.ids:
        raise RuntimeError("No engines found in the cluster. Ensure 'ipcluster start' is running.")
    print(f"Connected to cluster with {len(client.ids)} engines.")

    # Get a direct view on all engines; map calls split the input across them
    dview = client[:]

    # Define a simple task to run in parallel
    def square(x):
        # Import inside the function: engines do not share the client's imports
        import time
        time.sleep(0.01)  # Simulate some work
        return x * x

    # Run the task synchronously (blocks until all results arrive)
    print("Executing map_sync on range(10)...")
    results = dview.map_sync(square, range(10))
    print(f"Parallel results (first 5): {results[:5]}...")

    # Run os.getpid on every engine asynchronously, then wait for the results
    print("Executing async getpid()...")
    ar = dview.apply_async(os.getpid)
    pids = ar.get()
    print(f"PIDs from engines: {pids}")
except Exception as e:
    print(f"Could not connect to ipyparallel cluster or encountered an error: {e}")
    print("Please ensure 'ipcluster start' is running and no firewall is blocking access.")
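The Quickstart uses only a direct view. To illustrate the `DirectView` versus `LoadBalancedView` distinction from the Warnings, here is a small sketch that runs the same map through both. The function name `map_with_both_views` is hypothetical; it expects an already connected `ipp.Client` and uses only `client[:]`, `client.load_balanced_view()`, and `map_sync`.

```python
def map_with_both_views(client, func, items):
    """Run the same map through a DirectView and a LoadBalancedView."""
    # DirectView: the input is split across all engines up front.
    direct = client[:]
    # LoadBalancedView: the scheduler hands tasks to engines as they become free.
    balanced = client.load_balanced_view()
    return {
        "direct": list(direct.map_sync(func, items)),
        "balanced": list(balanced.map_sync(func, items)),
    }
```

Both calls should return results in input order, so the two lists are expected to match. The difference shows up in timing: when task durations vary widely, the load-balanced view tends to keep engines busier, while the direct view is the one to use for operations that must reach every engine.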