Modal
Serverless cloud infrastructure for AI workloads. Run Python functions on GPUs with sub-second cold starts. v1.0 (May 2025) introduced multiple breaking API changes. Rapid release cadence: multiple new versions per week.
Warnings
- breaking modal.Stub renamed to modal.App. Any code using modal.Stub() raises AttributeError on current versions.
- breaking modal.Mount removed from public API in v1.0. Passing mount= to @app.function or @app.cls raises TypeError.
- breaking Automounting of local Python modules removed in v1.0. Remote containers no longer automatically include local imports.
- breaking Autoscaler parameters renamed in v1.0: keep_warm → min_containers, concurrency_limit → max_containers, container_idle_timeout → scaledown_window.
- breaking @modal.build decorator deprecated in v1.0. Using it for model weight downloads is no longer the recommended pattern.
- deprecated Image.copy_local_dir and Image.copy_local_file deprecated. Replaced by Image.add_local_dir and Image.add_local_file.
- deprecated modal.web_endpoint renamed to modal.fastapi_endpoint. Old name still works but will be removed in a future v1.x release.
- gotcha Modal releases multiple versions per week including pre-release .devN builds. Unpinned installs in CI can pick up pre-release versions unexpectedly.
- gotcha modal.Dict values expire after 7 days of inactivity on the new backend (introduced May 2025). Previously created Dicts use the old backend with no expiry.
Install
- pip install modal
- modal setup
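Given the release cadence and pre-release `.devN` builds noted in the warnings, CI installs are safer pinned to an exact tested version; the version number below is illustrative.

```shell
# Pin an exact, tested version in requirements.txt / CI (version is illustrative)
pip install "modal==1.0.4"
```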
Imports
- App
import modal
app = modal.App()
- fastapi_endpoint
@modal.fastapi_endpoint()
- concurrent
@modal.concurrent(max_inputs=10)
Quickstart
import modal

app = modal.App()

@app.function(
    gpu="A10G",
    image=modal.Image.debian_slim().pip_install("torch"),
)
def run_inference(prompt: str) -> str:
    import torch  # import inside the function so it resolves in the remote container
    # your model code here
    return "result"

@app.local_entrypoint()
def main():
    result = run_inference.remote("hello")
    print(result)
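The quickstart is executed through the Modal CLI rather than plain `python`; the filename below is illustrative.

```shell
modal run inference.py      # runs main() as an ephemeral app (filename is illustrative)
modal deploy inference.py   # deploys the app persistently
```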