{"library":"torchaudio","title":"TorchAudio","description":"TorchAudio is an open-source library for audio and signal processing with PyTorch, providing functions, datasets, model implementations, and application components for machine learning tasks. While it has transitioned into a maintenance phase since version 2.8/2.9 to reduce redundancies and focus on ML audio processing, it continues to release new versions (currently 2.11.0) in alignment with PyTorch releases.","status":"active","version":"2.11.0","language":"en","source_language":"en","source_url":"https://github.com/pytorch/audio","tags":["audio","deep learning","pytorch","signal processing","machine learning"],"install":[{"cmd":"pip install torchaudio","lang":"bash","label":"Default Install"}],"dependencies":[{"reason":"TorchAudio is built on PyTorch and requires a compatible version.","package":"torch","optional":false},{"reason":"Required for `torchaudio.io` module and the underlying TorchCodec for audio loading/saving. Installation via conda or system package manager is recommended.","package":"ffmpeg","optional":true},{"reason":"Required for Automatic Speech Recognition with Emformer RNN-T.","package":"sentencepiece","optional":true},{"reason":"Required for Text-to-Speech with Tacotron2.","package":"deep-phonemizer","optional":true}],"imports":[{"symbol":"torchaudio","correct":"import torchaudio"},{"symbol":"transforms","correct":"from torchaudio import transforms"},{"symbol":"functional","correct":"from torchaudio import functional"}],"quickstart":{"code":"import torch\nimport torchaudio\nfrom torchaudio import transforms\n\n# Create a dummy waveform (1 channel, 16000 samples at 16kHz)\n# In a real scenario, you would load an audio file: waveform, sample_rate = torchaudio.load(\"path/to/audio.wav\")\nwaveform = torch.randn(1, 16000)\nsample_rate = 16000\n\n# Define a MelSpectrogram transform\nmelspectrogram_transform = transforms.MelSpectrogram(sample_rate=sample_rate, n_mels=128)\n\n# Apply the 
transform\nmelspectrogram = melspectrogram_transform(waveform)\n\nprint(f\"Waveform shape: {waveform.shape}\")\nprint(f\"MelSpectrogram shape: {melspectrogram.shape}\")\n# Expected output:\n# Waveform shape: torch.Size([1, 16000])\n# MelSpectrogram shape: torch.Size([1, 128, 81]) with the defaults n_fft=400, hop_length=200, center=True (frames = 1 + 16000 // 200)\n","lang":"python","description":"This quickstart demonstrates how to import TorchAudio, create a dummy audio waveform, and apply a common audio transformation like MelSpectrogram. In practice, `torchaudio.load` is used to load actual audio files."},"warnings":[{"fix":"Refer to the TorchAudio 2.9 migration guide in the official documentation to identify and update removed APIs. Many functions were consolidated or moved to `torchcodec`.","message":"Breaking API changes: Most APIs explicitly marked as 'drop' were deprecated in TorchAudio 2.8 and subsequently removed in 2.9. This can cause `AttributeError` or `ImportError` when upgrading from older versions.","severity":"breaking","affected_versions":">=2.9.0"},{"fix":"Review calls to `torchaudio.load()` and `torchaudio.save()`. Migrate to the native `torchcodec` APIs if you rely on parameters that are now ignored or need better performance. Ensure `torchcodec` is installed and compatible with your `torch` version.","message":"The `torchaudio.load()` and `torchaudio.save()` functions (since 2.9) now internally rely on the `torchcodec` library. While they maintain a compatible API, some parameters like `normalize`, `buffer_size`, and `backend` are ignored. 
For optimal performance and full control, it is recommended to directly use `torchcodec.decoders.AudioDecoder` and `torchcodec.encoders.AudioEncoder`.","severity":"gotcha","affected_versions":">=2.9.0"},{"fix":"Always install `torch` and `torchaudio` versions that are explicitly listed as compatible in the official TorchAudio compatibility matrix (e.g., `torch==X.Y.Z` and `torchaudio==X.Y.Z`).","message":"Strict PyTorch Version Compatibility: TorchAudio releases are tightly coupled with specific PyTorch versions. Using mismatched versions of `torch` and `torchaudio` will lead to runtime errors, particularly with C++ extensions.","severity":"breaking","affected_versions":"All versions"},{"fix":"Ensure FFmpeg is installed and discoverable by your system. For `conda` environments, `conda install -c conda-forge 'ffmpeg<8'` (or a suitable version) is often effective. Refer to `torchcodec` installation instructions for detailed FFmpeg compatibility.","message":"FFmpeg dependency for I/O: TorchAudio's audio loading and saving functionalities, particularly through the `torchcodec` backend, heavily rely on FFmpeg being installed and accessible on your system. Missing or incompatible FFmpeg versions can lead to `RuntimeError` during audio processing.","severity":"gotcha","affected_versions":"All versions, especially >=2.9.0"}],"env_vars":null,"last_verified":"2026-04-05T00:00:00.000Z","next_check":"2026-07-04T00:00:00.000Z"}