audiocraft - Audio Generation

1.3.0 · active · verified Fri Apr 17

Audiocraft is a PyTorch research library from Meta AI (FAIR) for audio generation. It includes models such as MusicGen (text-to-music) and AudioGen (text-to-sound) and provides tools for both inference and training. Currently at version 1.3.0, it is actively developed, with new releases roughly every 1-3 months, often coinciding with new model research.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart loads a pretrained MusicGen model, generates an 8-second audio clip from a text description, and saves it as a WAV file. The first run downloads the model weights, so make sure you have enough disk space and a stable internet connection. A GPU is strongly recommended; generation on CPU works but is much slower.

from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load a pretrained MusicGen model (the small checkpoint is a good choice for quick tests).
# The first call downloads the model weights (~2GB for the small checkpoint).
# Current releases expect the full Hugging Face ID; the bare name 'small' is a deprecated alias.
# To pick a device explicitly: MusicGen.get_pretrained('facebook/musicgen-small', device='cuda')
# GPU use requires a compatible PyTorch/CUDA setup.
model = MusicGen.get_pretrained('facebook/musicgen-small')
model.set_generation_params(duration=8) # Generate 8 seconds of audio

# Define a description for the music
description = "a retro synthwave track with a driving beat"

print(f"Generating audio for: '{description}'...")
# generate() takes a list of text descriptions and returns one sample per entry.
# For unconditional generation, call model.generate_unconditional(num_samples=...) instead.
samples = model.generate(descriptions=[description], progress=True)

# Save the generated audio to a WAV file
# `samples` is a torch.Tensor. It's good practice to move to CPU before saving if it's on GPU.
audio_write(
    'my_synthwave_track',
    samples[0].cpu(), # Take the first generated sample and move to CPU
    model.sample_rate,
    strategy="loudness",
    loudness_compressor=True # Soft-clips peaks after loudness normalization to avoid clipping
)

print("Audio saved as 'my_synthwave_track.wav'")
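As a rough sanity check on the result, the size of `samples` can be estimated before running the model. The sketch below is plain Python and does not import audiocraft. It assumes a 32 kHz mono output and a 50 Hz token frame rate, which are the values usually quoted for the small MusicGen checkpoint; read `model.sample_rate` at runtime rather than relying on these constants. The helper names are hypothetical and exist only for illustration.

```python
# Assumed values for the small MusicGen checkpoint (not read from the library).
SAMPLE_RATE_HZ = 32_000   # assumed audio sample rate
FRAME_RATE_HZ = 50        # assumed token frames per second of audio
CHANNELS = 1              # assumed mono output


def expected_output_shape(batch_size, duration_s,
                          sample_rate=SAMPLE_RATE_HZ, channels=CHANNELS):
    """Estimate the [batch, channels, samples] shape of the generated tensor."""
    return (batch_size, channels, int(duration_s * sample_rate))


def expected_token_frames(duration_s, frame_rate=FRAME_RATE_HZ):
    """Estimate how many token frames the model decodes for a given duration."""
    return int(duration_s * frame_rate)


shape = expected_output_shape(batch_size=1, duration_s=8)
frames = expected_token_frames(duration_s=8)
print(shape)   # (1, 1, 256000)
print(frames)  # 400
```

Under these assumptions, the 8-second clip from the quickstart is about 256,000 audio samples decoded from roughly 400 token frames, which is why generation time grows with `duration`.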
