SNAC
version 1.2.1, verified Fri May 01, auth: no, python
Multi-Scale Neural Audio Codec for audio compression, supporting 24 kHz, 32 kHz, and 44 kHz sampling rates. This is a PyTorch-based library for encoding audio into discrete codes (suitable for language modeling) and decoding back to waveform. Current version 1.2.1 has a stable API with `encode` and `decode` methods.
pip install snac
Common errors
error ImportError: cannot import name 'SNAC' from 'snac'
cause SNAC was not installed correctly or an incompatible version is installed.
fix
Ensure you installed the correct package:
pip install snac
Also check that you are not shadowing the package with a local file named snac.py.
error RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!
cause The model is on one device, but the input tensor is on another.
fix
Move the model and input to the same device:
model = model.to('cuda'); audio = audio.to('cuda')
error OSError: Can't load tokenizer for 'hubertsiuzdak/snac_24khz'
cause The model name is incorrect or the model is not publicly accessible.
fix
Use a valid Hugging Face model ID (e.g., 'hubertsiuzdak/snac_24khz', 'hubertsiuzdak/snac_32khz', 'hubertsiuzdak/snac_44khz') or ensure your network can access huggingface.co.
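One way to avoid typos in the model ID is to resolve it from a fixed table. The sketch below uses only the three IDs listed above; `KNOWN_SNAC_MODELS` and `resolve_model_id` are hypothetical names introduced for illustration and are not part of the snac package.

```python
# Model IDs copied from the fix text above, keyed by the sample-rate label in their names.
KNOWN_SNAC_MODELS = {
    "24khz": "hubertsiuzdak/snac_24khz",
    "32khz": "hubertsiuzdak/snac_32khz",
    "44khz": "hubertsiuzdak/snac_44khz",
}


def resolve_model_id(label: str) -> str:
    """Return the model ID for a label such as '24khz' or '24 kHz' (hypothetical helper)."""
    key = label.strip().lower().replace(" ", "")
    try:
        return KNOWN_SNAC_MODELS[key]
    except KeyError:
        valid = ", ".join(sorted(KNOWN_SNAC_MODELS))
        raise ValueError(
            f"unknown SNAC model label {label!r}; expected one of: {valid}"
        ) from None


print(resolve_model_id("24 kHz"))
```

The returned string can then be passed to `SNAC.from_pretrained`. A misspelled label fails immediately with a readable message instead of a download error.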
Warnings
gotcha The `encode` method returns a list of tensors (one per layer) in versions <1.2.0, but a single stacked tensor in 1.2.0+. Check the installed version and inspect the return type before indexing the result or calling `.shape` on it.
fix Upgrade to >=1.2.0, or keep `codes = model.encode(audio)` and handle the result as a list of tensors.
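Code that has to run against more than one snac version can branch on the return type instead of on the version number. The following is a minimal sketch: `code_shapes` is a hypothetical helper, and the stand-in objects only mimic the `.shape` attribute of a tensor so the example runs without torch or snac installed. The shapes are illustrative, not real model output.

```python
from types import SimpleNamespace


def code_shapes(codes):
    """Return a list of shape tuples whether `codes` is a list of tensors or one tensor."""
    if isinstance(codes, (list, tuple)):
        return [tuple(c.shape) for c in codes]
    return [tuple(codes.shape)]


# Stand-ins with a .shape attribute; replace them with the result of model.encode(audio).
as_list = [SimpleNamespace(shape=(1, 12)), SimpleNamespace(shape=(1, 24))]
as_tensor = SimpleNamespace(shape=(3, 1, 12))

print(code_shapes(as_list))    # [(1, 12), (1, 24)]
print(code_shapes(as_tensor))  # [(3, 1, 12)]
```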
deprecated Loading models from a local filepath was broken in 1.2.0 and fixed in 1.2.1. If you use `SNAC.from_pretrained('./local_model')`, ensure version >=1.2.1.
fix Upgrade to 1.2.1 or use a Hugging Face model ID.
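The version requirement can be checked at startup with the standard library alone. This is a sketch under stated assumptions: `version_tuple` and `supports_local_paths` are hypothetical helpers, and the simple parser only handles dotted numeric versions with an optional suffix such as `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text: str) -> tuple:
    """Turn '1.2.1' into (1, 2, 1); a non-numeric suffix such as 'rc1' is ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad '1.2' to (1, 2, 0)
    return tuple(parts)


def supports_local_paths(installed: str) -> bool:
    """True when the version is >= 1.2.1, the release noted above as fixing local loading."""
    return version_tuple(installed) >= (1, 2, 1)


try:
    installed = version("snac")
    status = "local paths OK" if supports_local_paths(installed) else "upgrade to >=1.2.1"
    print(installed, status)
except PackageNotFoundError:
    print("snac is not installed")
```

Comparing tuples rather than strings matters here: as strings, "1.10.0" would sort before "1.2.1".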
gotcha The model expects audio resampled to the model's sample rate (24 kHz, 32 kHz, or 44 kHz). Failure to resample will produce garbled output.
fix Resample input audio to match the model's sample rate before encoding.
breaking Version 1.0.0 introduced a completely new architecture and model zoo. Models from v0.x (if they existed) are incompatible.
fix Use only v1.x models and upgrade to latest version.
Imports
- SNAC
from snac import SNAC
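If this import fails with the ImportError listed under Common errors, it helps to see which file Python would actually load for `snac`. The following sketch uses only the standard library; `locate_module` and `is_shadowed_by_cwd` are hypothetical helper names.

```python
import importlib.util
from pathlib import Path


def locate_module(name: str):
    """Return the file a top-level import of `name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location or spec.origin is None:
        return None  # missing, built-in, or namespace package without a single file
    return Path(spec.origin)


def is_shadowed_by_cwd(name: str) -> bool:
    """True when `name` resolves to a file under the current working directory.

    This is also true for a virtualenv created inside the working directory,
    so inspect the printed path before renaming anything.
    """
    path = locate_module(name)
    if path is None:
        return False
    return Path.cwd().resolve() in path.resolve().parents


print("snac resolves to:", locate_module("snac"))
print("possibly shadowed by a local file:", is_shadowed_by_cwd("snac"))
```

A path ending in `snac.py` inside your project directory, rather than inside `site-packages`, points to the shadowing problem described above.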
Quickstart
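import torch
from snac import SNAC

# Load the pretrained 24 kHz model and switch it to inference mode
model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz").eval()
audio = torch.randn(1, 1, 24000)  # placeholder: 1 second of 24 kHz audio, shape (B, 1, T)

with torch.inference_mode():
    codes = model.encode(audio)          # discrete codes
    reconstructed = model.decode(codes)  # waveform rebuilt from the codes

# encode may return a list of tensors or a single tensor depending on the version
code_list = codes if isinstance(codes, (list, tuple)) else [codes]
print("Code shapes:", [tuple(c.shape) for c in code_list])
print("Audio shape:", reconstructed.shape)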
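To run the Quickstart on a GPU without hitting the device-mismatch RuntimeError listed under Common errors, pick one device and place both the model and the input on it. The sketch below uses a small `torch.nn.Conv1d` as a stand-in for the SNAC model so it runs without downloading weights; with the real model, the same `.to(device)` call applies.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the SNAC model (assumption: any nn.Module moves the same way).
model = torch.nn.Conv1d(1, 4, kernel_size=3, padding=1).to(device).eval()

# Create the input directly on the chosen device instead of moving it afterwards.
audio = torch.randn(1, 1, 24000, device=device)

with torch.inference_mode():
    out = model(audio)

print(out.device, tuple(out.shape))
```

Deriving both placements from a single `device` variable keeps the script working unchanged on machines with and without CUDA.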