Stable Audio Tools
A Python library by Stability AI for training and inference with generative audio models, including Stable Audio and Dance Diffusion. Current version 0.0.19. Active development with frequent updates.
pip install stable-audio-tools

Common errors

error ModuleNotFoundError: No module named 'stable_audio_tools' ↓
cause Library not installed or installed in the wrong environment.
fix Run pip install stable-audio-tools in the correct Python environment.

error KeyError: 'stable-audio-open-1.0' ↓
cause Model name does not exist or is misspelled.
fix Check available models with get_models() and use the exact key string.

error RuntimeError: Expected all tensors to be on the same device, but found at least two devices ↓
cause Model and input tensors are on different devices (CPU/GPU).
fix Move the model and all inputs to the same device with .to(device).

Warnings
gotcha The library is in early development (v0.0.19). APIs may change without notice. Always pin your dependency version. ↓
fix Install with `pip install stable-audio-tools==0.0.19` and watch GitHub for updates.
deprecated The old import path `from stable_audio_tools.models import get_models` is deprecated in favor of `from stable_audio_tools import get_models`. ↓
fix Use `from stable_audio_tools import get_models`.
gotcha Model names supplied to `get_pretrained_model_and_config` must exactly match the keys returned by `get_models()`; names are case- and hyphen-sensitive. ↓
fix Always inspect the list returned by `get_models()` to get exact names.
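Since `get_models()` is described above as returning a dict keyed by exact model name, a small lookup guard can catch near-miss names before they surface as a bare `KeyError`. This is an illustrative sketch only: `resolve_model_name` and the sample `keys` list are hypothetical, not part of the library.

```python
import difflib

def resolve_model_name(name, available):
    """Return name if it is an exact key; otherwise raise with suggestions."""
    if name in available:
        return name
    close = difflib.get_close_matches(name, available, n=3)
    hint = f" Did you mean: {', '.join(close)}?" if close else ""
    raise KeyError(f"Unknown model {name!r}.{hint}")

# Hypothetical keys for illustration only; always check get_models() yourself.
keys = ["stable-audio-open-1.0", "dance-diffusion-base"]
print(resolve_model_name("stable-audio-open-1.0", keys))  # → stable-audio-open-1.0
```

Feeding the resolved name into the loader means a typo fails fast with a readable message instead of deep inside model loading.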
Imports

- get_models
  from stable_audio_tools import get_models
- create_model_from_config
  from stable_audio_tools.interface import create_model_from_config
- ModelConfig
  from stable_audio_tools.interface import ModelConfig
- get_pretrained_model_and_config
  from stable_audio_tools.interface import get_pretrained_model_and_config
Quickstart
import torch
from stable_audio_tools import get_models
from stable_audio_tools.interface import get_pretrained_model_and_config

# List available models
models = get_models()
print("Available models:", list(models.keys()))

# Load a Stable Audio model (replace with an actual model name from the list)
model_name = "stable-audio-open-1.0"  # example; check get_models()
model, config = get_pretrained_model_and_config(model_name)

# Move the model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Note: full text-to-audio generation also requires a T5 text encoder and a
# diffusion sampling loop, which are beyond this quickstart.
print("Model loaded successfully. Refer to official docs for full inference.")
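The same-device rule behind the RuntimeError above can be sketched with plain PyTorch; `TinyModel` here is a hypothetical stand-in for the loaded audio model, not a class from stable-audio-tools.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

class TinyModel(torch.nn.Module):
    """Hypothetical stand-in for the loaded audio model."""
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.proj(x)

model = TinyModel().to(device)    # move model parameters to the target device
x = torch.randn(1, 4).to(device)  # move every input tensor to the same device
y = model(x)                      # no device-mismatch RuntimeError
```

Forgetting either `.to(device)` call (on the model or on an input) reproduces the "Expected all tensors to be on the same device" error.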