{"id":8723,"library":"torchlibrosa","title":"TorchLibrosa: PyTorch Implementation of Librosa","description":"TorchLibrosa provides a PyTorch implementation of core `librosa` audio feature extraction functions, enabling GPU acceleration for tasks such as spectrogram and mel-spectrogram computation. This is particularly beneficial for deep learning pipelines that require faster feature generation on GPUs during training and evaluation. The library aims for numerical results almost identical to CPU-based `librosa` (difference less than 1e-5). The current version is 0.1.0, with an infrequent release cadence; the latest release was in February 2023.","status":"active","version":"0.1.0","language":"en","source_language":"en","source_url":"https://github.com/qiuqiangkong/torchlibrosa","tags":["audio processing","deep learning","pytorch","librosa","feature extraction","gpu acceleration","spectrogram","mel-spectrogram"],"install":[{"cmd":"pip install torchlibrosa","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Numerical operations","package":"numpy","optional":false},{"reason":"Provides reference CPU implementations and utility functions; torchlibrosa aims to replicate its functionality.","package":"librosa>=0.9.0","optional":false},{"reason":"Core deep learning framework; torchlibrosa is built on PyTorch.","package":"torch","optional":false}],"imports":[{"symbol":"Spectrogram","correct":"from torchlibrosa import Spectrogram"},{"symbol":"LogmelFilterBank","correct":"from torchlibrosa import LogmelFilterBank"},{"symbol":"STFT","correct":"from torchlibrosa import STFT"},{"symbol":"ISTFT","correct":"from torchlibrosa import ISTFT"},{"note":"Common alias for convenience.","symbol":"torchlibrosa as tl","correct":"import torchlibrosa as tl"}],"quickstart":{"code":"import torch\nimport torchlibrosa as tl\n\nbatch_size = 16\nsample_rate = 22050\nwin_length = 2048\nhop_length = 512\nn_mels = 128\n\n# Create a batch of dummy audio (e.g., for GPU 
processing)\nbatch_audio = torch.empty(batch_size, sample_rate).uniform_(-1, 1)\n\n# Instantiate a feature extractor using Sequential for a pipeline\nfeature_extractor = torch.nn.Sequential(\n    tl.Spectrogram(\n        hop_length=hop_length,\n        win_length=win_length,\n    ),\n    tl.LogmelFilterBank(\n        sr=sample_rate,\n        n_mels=n_mels,\n        is_log=False,  # Default is True; False returns mel energies without log scaling\n    )\n)\n\n# Process the audio to get mel spectrograms (not log-scaled, since is_log=False)\nbatch_feature = feature_extractor(batch_audio)\n\nprint(f\"Input audio shape: {batch_audio.shape}\")\nprint(f\"Output feature shape (batch_size, 1, time_steps, mel_bins): {batch_feature.shape}\")","lang":"python","description":"This quickstart extracts mel spectrograms from a batch of audio signals by chaining `torchlibrosa`'s `Spectrogram` and `LogmelFilterBank` modules in a PyTorch `nn.Sequential` pipeline. Because `is_log=False` is passed, the output is not log-scaled; leave `is_log` at its default (`True`) to get log mel spectrograms."},"warnings":[{"fix":"Ensure PyTorch is installed manually via `pip install torch` (and `torchvision`, `torchaudio` if needed) or by specifying it in your project's `requirements.txt`.","message":"PyTorch is a fundamental dependency for `torchlibrosa`, but it is not explicitly listed in the `install_requires` of the `setup.py`. This can lead to a `ModuleNotFoundError` if PyTorch is not installed separately.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Be aware of the minor numerical differences. For most deep learning applications, this level of difference is acceptable. If exact parity is critical, consider using `librosa` for CPU-based processing and carefully benchmark results.","message":"While `torchlibrosa` aims to provide 'almost identical features' to `librosa`, a 'numerical difference less than 1e-5' is explicitly stated. 
Users expecting bit-for-bit identical results to `librosa` on CPU might encounter minor discrepancies.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade to the latest `torchlibrosa` (0.1.0) and ensure compatibility with your PyTorch version. Check GitHub issues for reported incompatibilities if problems persist.","message":"Older versions of `torchlibrosa` (e.g., 0.0.9) had compatibility issues with specific PyTorch versions (e.g., `torch=1.10.0+cu111`), potentially leading to runtime errors related to internal function calls or argument mismatches.","severity":"breaking","affected_versions":"<=0.0.9"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install PyTorch separately: `pip install torch` (or the appropriate version for your CUDA setup).","cause":"`torchlibrosa` implicitly depends on `torch` but does not list it in its `install_requires`.","error":"ModuleNotFoundError: No module named 'torch'"},{"fix":"Update `torchlibrosa` to version 0.1.0 and ensure your `librosa` dependency is `>=0.9.0` as specified. If the issue persists, try updating PyTorch to a recent stable version.","cause":"This error has been reported in older versions (e.g., 0.0.9) when initializing `STFT` or `Spectrogram`, indicating a potential mismatch in expected arguments for internal padding functions, possibly due to `librosa` or `torch` version incompatibilities.","error":"TypeError: pad_center() takes 1 positional argument but 2 were given"},{"fix":"Carefully review the expected input shape for the `torchlibrosa` module you are using (e.g., `Spectrogram` expects `(batch_size, samples)`). 
Ensure your input tensor matches these requirements, potentially using `unsqueeze` or `permute` to reshape.","cause":"This is a common PyTorch error, often occurring when input tensors to `torchlibrosa` modules (which internally use matrix multiplications) do not have the expected dimensions or shape.","error":"RuntimeError: mat1 and mat2 shapes cannot be multiplied"}]}