{"id":8725,"library":"torchx","title":"TorchX SDK and Components","description":"TorchX is a Python SDK for MLOps that helps you compose, configure, and launch PyTorch applications on schedulers such as local processes, Docker, Kubernetes, and Ray. It provides a common API for distributed training, serving, and other ML workloads. The current version is 0.7.0, with major releases occurring every few months.","status":"active","version":"0.7.0","language":"en","source_language":"en","source_url":"https://github.com/pytorch/torchx","tags":["pytorch","distributed-training","mlops","orchestration","job-scheduler","ray","kubernetes"],"install":[{"cmd":"pip install torchx","lang":"bash","label":"Stable release"}],"dependencies":[{"reason":"Core dependency for defining and running PyTorch applications.","package":"torch","optional":false},{"reason":"Commonly used with TorchX for computer vision applications.","package":"torchvision","optional":true},{"reason":"Required for using the 'local_docker' scheduler.","package":"docker","optional":true}],"imports":[{"note":"Runners are created with the `get_runner()` factory, which returns a `torchx.runner.Runner`.","wrong":"from torchx.runner import TorchxRunner","symbol":"get_runner","correct":"from torchx.runner import get_runner"},{"symbol":"AppDef","correct":"from torchx.specs import AppDef"},{"symbol":"Role","correct":"from torchx.specs import Role"},{"symbol":"Resource","correct":"from torchx.specs import Resource"},{"note":"`ddp` is a component function defined in the `torchx.components.dist` module, not a submodule.","wrong":"from torchx.components import ddp","symbol":"ddp","correct":"from torchx.components.dist import ddp"}],"quickstart":{"code":"from torchx import specs\nfrom torchx.runner import get_runner\n\n# Define a simple application with a single role\napp = specs.AppDef(\n    name='hello-world',\n    roles=[\n        specs.Role(\n            name='worker',\n            image='/tmp',  # required field; local_cwd runs from the current working directory\n            entrypoint='echo',\n            args=['Hello, TorchX!'],\n            num_replicas=1,\n            resource=specs.Resource(cpu=1, gpu=0, memMB=512)\n        )\n    ]\n)\n\n# Create a TorchX runner with the default schedulers registered\nrunner = get_runner()\n\n# 
Launch the application on the local_cwd scheduler,\n# which runs each replica as a local process in the current working directory\napp_handle = runner.run(app, scheduler='local_cwd')\nprint(f\"Application '{app.name}' launched with handle: {app_handle}\")\n\n# You can optionally block until the application reaches a terminal state\n# runner.wait(app_handle)\n# print(f\"Application '{app.name}' completed.\")","lang":"python","description":"This quickstart defines a basic `AppDef` with a single role and launches it with a TorchX runner on the `local_cwd` scheduler. The `echo` entrypoint prints 'Hello, TorchX!' to the console."},"warnings":[{"fix":"Migrate from `ddp_scheduler` to the `ray_scheduler` or `kubernetes` scheduler for distributed training applications.","message":"The `torchx.schedulers.ddp_scheduler` module was removed and replaced with the `ray_scheduler` for DDP-like workloads.","severity":"breaking","affected_versions":">=0.7.0"},{"fix":"Update your CLI scripts to use `torchx launch` instead of `torchx run`.","message":"The CLI command `torchx run` was renamed to `torchx launch`.","severity":"breaking","affected_versions":">=0.6.0"},{"fix":"Remove the `scheduler` argument from `specs.Role` definitions. Pass the desired scheduler name to the runner instead: `runner.run(app, scheduler='your_scheduler')`.","message":"The `scheduler` field was removed from `specs.Role`. 
Scheduler selection now happens at the `runner.run()` level.","severity":"breaking","affected_versions":">=0.5.0"},{"fix":"Ensure Docker Desktop or the Docker daemon is installed and running on your system before launching jobs with Docker-dependent schedulers.","message":"Using Docker-based schedulers (e.g., `local_docker`, `kubernetes` with container images) requires a running Docker daemon.","severity":"gotcha","affected_versions":"All"},{"fix":"Call the component to build an `AppDef`, then pass the result to `runner.run()`, e.g., `runner.run(torchx.components.dist.ddp(...), ...)`, rather than treating `ddp` as an application or class.","message":"TorchX components like `dist.ddp` are factory functions that return `AppDef` objects, not direct applications or classes.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Update your code to use the Ray scheduler for DDP workloads, or use a general scheduler like Kubernetes with appropriate configurations. For example, `runner.run(app, scheduler='ray')`.","cause":"The `ddp_scheduler` module was removed in TorchX v0.7.0.","error":"ModuleNotFoundError: No module named 'torchx.schedulers.ddp_scheduler'"},{"fix":"Replace `torchx run` with `torchx launch` in your command-line invocations.","cause":"The `torchx run` CLI command was renamed to `torchx launch` in v0.6.0.","error":"Error: No such command 'run'"},{"fix":"Remove the `scheduler` argument from your `specs.Role` constructor. Specify the scheduler via `runner.run(app, scheduler='...')` instead.","cause":"The `scheduler` argument was removed from `torchx.specs.Role` in v0.5.0.","error":"TypeError: Role() got an unexpected keyword argument 'scheduler'"},{"fix":"Start the Docker daemon on your machine. For Linux, ensure the Docker service is running (`sudo systemctl start docker`). 
For Docker Desktop, ensure the application is open and running.","cause":"Attempting to use a Docker-based scheduler (e.g., `local_docker`) without a running Docker daemon.","error":"RuntimeError: Docker client is not available or daemon is not running. Please ensure docker is installed and running."}]}