{"id":2811,"library":"torch-audiomentations","title":"Torch Audiomentations","description":"torch-audiomentations is a PyTorch library for audio data augmentation, designed for deep learning workflows. It offers fast, GPU-compatible transforms for batches of mono or multichannel audio. The transforms extend `nn.Module`, so they can be integrated directly into neural network models. The current version is 0.12.0, and the project is actively maintained.","status":"active","version":"0.12.0","language":"en","source_language":"en","source_url":"https://github.com/asteroid-team/torch-audiomentations","tags":["audio","augmentation","pytorch","deep learning","speech","sound"],"install":[{"cmd":"pip install torch-audiomentations","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Required for audio processing utilities.","package":"julius","optional":false},{"reason":"Core PyTorch dependency for tensor operations and neural network modules.","package":"torch","optional":false},{"reason":"Required for pitch shifting functionality.","package":"torch-pitch-shift","optional":false},{"reason":"Required for audio I/O and transformations, especially after `librosa` removal.","package":"torchaudio","optional":false},{"reason":"Optional dependency for loading augmentation configurations from YAML files.","package":"PyYAML","optional":true}],"imports":[{"note":"The PyPI package name uses a hyphen, but the Python import path uses an underscore.","wrong":"from torch-audiomentations import Compose","symbol":"Compose","correct":"from torch_audiomentations import Compose"},{"symbol":"Gain","correct":"from torch_audiomentations import Gain"},{"symbol":"PolarityInversion","correct":"from torch_audiomentations import PolarityInversion"}],"quickstart":{"code":"import torch\nfrom torch_audiomentations import Compose, Gain, PolarityInversion\n\n# Initialize augmentation callable\napply_augmentation = Compose(\n    transforms=[\n        
Gain(min_gain_in_db=-15.0, max_gain_in_db=5.0, p=0.5),\n        PolarityInversion(p=0.5)\n    ]\n)\n\ntorch_device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\n# Make an example tensor with white noise.\n# This tensor represents 8 audio snippets with 2 channels (stereo) and 2 seconds of 16 kHz audio.\naudio_samples = torch.rand(size=(8, 2, 32000), dtype=torch.float32, device=torch_device) - 0.5\n\n# Apply augmentation.\nperturbed_audio_samples = apply_augmentation(audio_samples, sample_rate=16000)\n\nprint(f\"Original audio shape: {audio_samples.shape}\")\nprint(f\"Perturbed audio shape: {perturbed_audio_samples.shape}\")\nprint(f\"Running on device: {torch_device}\")","lang":"python","description":"This example applies a sequence of audio augmentations (Gain and PolarityInversion) to a batch of audio samples using `Compose`. It runs on the GPU when CUDA is available and falls back to the CPU otherwise."},"warnings":[{"fix":"Ensure all input audio tensors are 3-dimensional, even for mono audio (e.g., shape `(batch_size, 1, num_samples)`).","message":"Support for 1-dimensional and 2-dimensional audio tensors was removed. Only 3-dimensional audio tensors (batch_size, num_channels, num_samples) are supported.","severity":"breaking","affected_versions":"0.5.0+"},{"fix":"Opt in to `ObjectDict` output by passing `output_type=\"dict\"` to the transform (or to `Compose`), then read the augmented audio from the `.samples` attribute of the result. Consult the documentation for the full migration path.","message":"The default `torch.Tensor` output type is deprecated. An `ObjectDict` output type is available and is the recommended future-proof option. 
Support for `torch.Tensor` output will be removed in a future version.","severity":"deprecated","affected_versions":"0.11.0+"},{"fix":"If you see memory leaks in a multiprocessing setup, run the transforms on the CPU or set `num_workers=0` for the DataLoader.","message":"Using `torch-audiomentations` in a multiprocessing context (e.g., PyTorch's `DataLoader` with `num_workers > 0`) can lead to memory leaks.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For now, run transforms on a single GPU. Contact the project maintainers if multi-GPU support is critical for your use case.","message":"Multi-GPU (DDP) setups are not officially supported due to testing limitations and may not work as expected.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure `torchaudio` is installed and up to date. Any code that implicitly relies on `librosa` being installed through `torch-audiomentations` will break.","message":"The `librosa` dependency was entirely removed in favor of `torchaudio`.","severity":"breaking","affected_versions":"0.12.0+"},{"fix":"Update `torchaudio` to at least version `0.9.0` (e.g., `pip install \"torchaudio>=0.9.0\"`; the quotes keep the shell from treating `>` as a redirect).","message":"The minimum `torchaudio` dependency was bumped from `>=0.7.0` to `>=0.9.0`.","severity":"breaking","affected_versions":"0.11.1+"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}