Torch Pitch Shift
torch-pitch-shift is a Python library for fast pitch-shifting of audio clips in PyTorch, with full CUDA support. It also provides utilities for calculating efficient pitch-shift targets, which is useful for data augmentation, where speed matters more than hitting an exact pitch. The current version is 1.2.5, and the project is under active development.
Warnings
- breaking Older versions of `torch-pitch-shift` are tied to specific `torchaudio` versions. v1.2.1 added explicit support for `torchaudio` v0.11, and v1.2.2 added backwards compatibility with `torchaudio<=0.11.0`. If you hit import or runtime errors, check your installed `torchaudio` version and upgrade `torch-pitch-shift` to match.
- gotcha For improved audio quality and to reduce distortion during pitch-shifting, ensure you are using `torch-pitch-shift` version 1.2.0 or newer. This version introduced the `hop_length` argument, which significantly impacts output quality.
- gotcha The `get_fast_shifts` utility function may fail to compute valid pitch-shift ratios for certain sample rates or transpose ranges, raising an error. This typically occurs when no efficient fractional shifts can be found for the given parameters.
- gotcha The library has been flagged for using dynamic code execution (e.g., `eval()`) by security analysis tools like Socket.dev. While not necessarily a vulnerability, this practice can pose security risks and may prevent the code from running in environments with strict security policies.
- gotcha While `torch-pitch-shift` is designed for speed with PyTorch/CUDA, performing pitch shifts on CPU can be significantly slower. Related libraries have also noted performance bottlenecks on CPU for similar operations, which might apply here.
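To make the `get_fast_shifts` warning above more concrete, here is a minimal standard-library sketch of how a set of "efficient" ratios can come out empty. It assumes that an efficient ratio is a quotient of two products of the sample rate's prime factors. That assumption, along with the helper names `prime_factors` and `candidate_ratios`, is illustrative only and is not taken from the library's source or API.

```python
from fractions import Fraction
from itertools import combinations
from math import prod
from typing import List


def prime_factors(n: int) -> List[int]:
    """Return the prime factors of n with multiplicity, e.g. 12 -> [2, 2, 3]."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def candidate_ratios(sample_rate: int, low: Fraction, high: Fraction) -> List[Fraction]:
    """Hypothetical helper: ratios i/j within [low, high], excluding 1,
    where i and j are products of prime factors of sample_rate."""
    factors = prime_factors(sample_rate)
    products = {
        prod(combo)
        for size in range(1, len(factors) + 1)
        for combo in combinations(factors, size)
    }
    ratios = {Fraction(i, j) for i in products for j in products}
    return sorted(r for r in ratios if low <= r <= high and r != 1)


# One octave down to one octave up, expressed as frequency ratios.
octave = (Fraction(1, 2), Fraction(2))

for rate in (44100, 7919):  # 7919 is prime, so only the ratio 1 exists and is excluded
    ratios = candidate_ratios(rate, *octave)
    if not ratios:
        print(f"{rate} Hz: no candidate ratios; fall back to a plain semitone shift")
    else:
        print(f"{rate} Hz: {len(ratios)} candidates, from {ratios[0]} to {ratios[-1]}")
```

Under this assumption, a sample rate with few small prime factors, or a very narrow transpose range, leaves nothing to choose from, so calling code should be prepared to handle that case.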
Install
pip install torch-pitch-shift
Imports
- pitch_shift
from torch_pitch_shift import pitch_shift
- get_fast_shifts
from torch_pitch_shift import get_fast_shifts
- functional.pitch_shift
from torch_pitch_shift import pitch_shift
Quickstart
import torch
from torch_pitch_shift import pitch_shift
# Create a dummy mono audio waveform (batch_size, channels, samples)
sample_rate = 44100
duration_seconds = 2
num_samples = sample_rate * duration_seconds
waveform = torch.randn(1, 1, num_samples, dtype=torch.float32)
# Define the pitch shift amount in semitones
shift_semitones = 3.0 # Shift up by 3 semitones
# Perform the pitch shift
pitch_shifted_waveform = pitch_shift(waveform, shift_semitones, sample_rate)
print(f"Original waveform shape: {waveform.shape}")
print(f"Pitch-shifted waveform shape: {pitch_shifted_waveform.shape}")
# In a real scenario, you would now save or play 'pitch_shifted_waveform'
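The quickstart expresses the shift in semitones. In 12-tone equal temperament, a shift of n semitones corresponds to a frequency ratio of 2^(n/12). The sketch below shows that conversion and how far a simple fraction can drift from the requested pitch, which is the trade-off behind prioritizing speed over precision. It uses only the standard library. The functions are defined locally for illustration and are not imported from `torch_pitch_shift`, and `nearest_simple_ratio` is a hypothetical helper.

```python
from fractions import Fraction
from math import log2


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def ratio_to_semitones(ratio: float) -> float:
    """Inverse of semitones_to_ratio."""
    return 12.0 * log2(ratio)


def nearest_simple_ratio(semitones: float, max_denominator: int = 10) -> Fraction:
    """Hypothetical helper: closest fraction whose denominator is at most max_denominator."""
    return Fraction(semitones_to_ratio(semitones)).limit_denominator(max_denominator)


exact = semitones_to_ratio(3.0)
approx = nearest_simple_ratio(3.0)
error = ratio_to_semitones(float(approx)) - 3.0

print(f"exact ratio for +3 semitones: {exact:.4f}")
print(f"nearest simple ratio:         {approx}")
print(f"pitch error:                  {error:+.2f} semitones")
```

For augmentation, an error of a fraction of a semitone is usually acceptable. For musical transposition, prefer the exact semitone value.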