SkyPilot

0.12.0 · active · verified Sun Apr 12

SkyPilot is an open-source framework that allows users to run AI workloads on any cloud, simplifying multi-cloud deployments, optimizing costs, and improving performance. It provides a unified interface for launching, managing, and scaling GPU clusters and jobs across AWS, GCP, Azure, Kubernetes, and more. SkyPilot is actively developed, with frequent patch releases and major feature updates every few weeks to months. The current stable version is 0.12.0.

Warnings

Install

Imports

Quickstart

This quickstart defines a basic SkyPilot task with the Python SDK and launches it on a temporary cluster. SkyPilot provisions the necessary resources on an available cloud provider (e.g., AWS, GCP) based on your credentials and the task's resource requirements (e.g., the accelerators it requests). Before running SkyPilot commands, install your chosen cloud provider's CLI and configure it locally with valid credentials (e.g., `aws configure`, `gcloud auth login`); running `sky check` then reports which clouds SkyPilot can use.

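import sky

# Define a simple task to run on a cluster.
# The run command echoes a message and lists the attached GPU(s).
# Note: the first positional argument of sky.Task is the task name, so the
# command must be passed as run=...
task = sky.Task(
    run='echo "Hello from SkyPilot on $(hostname) with GPU $(nvidia-smi -L)"',
    workdir='.',  # Sync the current directory to the cluster
    setup='pip install pandas',  # Example setup; replace with your actual dependencies
)

# Request 1 GPU. sky.Task has no num_gpus argument; accelerators are set on
# sky.Resources. 'V100:1' is only an example spec; choose a type you have quota for.
task.set_resources(sky.Resources(accelerators='V100:1'))

# Launch a cluster named 'my-quickstart-cluster' and run the task.
# SkyPilot will provision resources on a chosen cloud (e.g., AWS, GCP)
# based on your configured cloud credentials and resource availability.
# Ensure your cloud provider CLIs (e.g., AWS CLI, gcloud CLI) are installed and authenticated.
print("Launching SkyPilot cluster (this may take a few minutes)...")
# In recent releases sky.launch() is asynchronous and returns a request ID;
# sky.stream_and_get() streams progress and blocks until the request finishes.
request_id = sky.launch(task, cluster_name='my-quickstart-cluster')
job_id, handle = sky.stream_and_get(request_id)

# Follow the job's output until it completes.
sky.tail_logs(cluster_name='my-quickstart-cluster', job_id=job_id, follow=True)

print("Cluster launched and task submitted. Run `sky status` to check.")
# To tear down the cluster after use:
# print("Tearing down cluster...")
# sky.stream_and_get(sky.down('my-quickstart-cluster'))
# print("Cluster torn down.")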
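
Because the quickstart depends on the cloud provider CLIs being installed locally, a small pre-flight check can catch a missing tool before a launch is attempted. The sketch below is not part of SkyPilot: the `CLOUD_CLIS` mapping and the `missing_clis` helper are hypothetical names chosen for illustration, and the check only confirms that an executable is on `PATH`. It does not validate credentials, which is what `sky check` is for.

```python
import shutil

# Hypothetical mapping from a cloud name to the CLI executable it relies on.
# These entries are illustrative assumptions; adjust them to the clouds you use.
CLOUD_CLIS = {
    "aws": "aws",
    "gcp": "gcloud",
    "azure": "az",
    "kubernetes": "kubectl",
}


def missing_clis(clouds):
    """Return the requested clouds whose CLI executable is not found on PATH."""
    missing = []
    for cloud in clouds:
        # Fall back to the cloud name itself if it is not in the mapping.
        executable = CLOUD_CLIS.get(cloud, cloud)
        if shutil.which(executable) is None:
            missing.append(cloud)
    return missing


if __name__ == "__main__":
    not_found = missing_clis(["aws", "gcp"])
    if not_found:
        print("CLI not found on PATH for:", ", ".join(not_found))
    else:
        print("All requested cloud CLIs are on PATH.")
```

An empty result means every requested CLI was found; it says nothing about whether the credentials behind those CLIs are valid.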
