TorchX SDK and Components

0.7.0 · active · verified Thu Apr 16

TorchX is a Python SDK for MLOps that lets you compose, configure, and launch PyTorch applications on a variety of schedulers, including local process, Docker, Kubernetes, and Ray. It provides a common API for distributed training, serving, and other ML workloads. The current release is 0.7.0.

Quickstart

This quickstart demonstrates how to define a basic `AppDef` with a single role and launch it on the `local_cwd` scheduler using a runner obtained from `torchx.runner.get_runner()`. It prints 'Hello, TorchX!' to the console via the `echo` command.

from torchx import specs
from torchx.runner import get_runner

# Define a simple application with a single role that runs `echo`
app = specs.AppDef(
    name='hello-world',
    roles=[
        specs.Role(
            name='worker',
            entrypoint='echo',
            args=['Hello, TorchX!'],
            num_replicas=1,
            resource=specs.Resource(cpu=1, gpu=0, memMB=512),
        )
    ],
)

# Obtain a runner; get_runner() is the supported factory for torchx.runner.Runner
runner = get_runner()

# Launch the application on the local_cwd scheduler, which runs the role's
# entrypoint as a subprocess in the current working directory
app_handle = runner.run(app, scheduler='local_cwd')
print(f"Application '{app.name}' launched with handle: {app_handle}")

# You can optionally block until the application reaches a terminal state
# status = runner.wait(app_handle)
# print(f"Application '{app.name}' finished with state: {status.state}")
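For quick experiments, a similar launch can also be done from the command line with the `torchx` CLI and the built-in `utils.echo` component. This is a sketch of the equivalent invocation; it assumes TorchX is installed and on your PATH:

```shell
# Run the built-in utils.echo component on the local_cwd scheduler;
# roughly equivalent to the AppDef defined above
torchx run --scheduler local_cwd utils.echo --msg "Hello, TorchX!"
```

The CLI prints the resulting app handle, which can later be passed to `torchx status` to query the run.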
