Descript Audiotools
Descript Audiotools is a Python library for object-oriented handling of audio data. It provides fast augmentation routines, batching, padding, and GPU-accelerated operations. The current stable version on PyPI is 0.7.2. The library is actively maintained, and development continues in its GitHub repository.
Warnings
- breaking Version 0.7.2 of `descript-audiotools` pins `protobuf<3.20,>=3.9.2`. An environment that requires a newer `protobuf` (e.g., 4.x or 5.x) will hit dependency conflicts that block installation, or runtime errors if the pin is overridden.
- gotcha The `AudioSignal.play()` method relies on the external `ffplay` command-line utility for audio playback. If `ffplay` (part of FFmpeg) is not installed or is not on your system's PATH, `signal.play()` will fail, either silently or with a command-not-found error.
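Both warnings above can be checked before installing or calling `play()`. The following is a minimal sketch using only the standard library; the helper names `release_tuple`, `protobuf_pin_ok`, and `preflight` are hypothetical and not part of `descript-audiotools`, and the version comparison is a simplified one that looks only at the leading numeric release segment.

```python
import re
import shutil
from importlib import metadata


def release_tuple(version):
    """Return the leading numeric release segment, e.g. '3.19.6' -> (3, 19, 6)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def protobuf_pin_ok(version):
    """True if `version` falls inside the pin quoted above: >=3.9.2,<3.20."""
    return (3, 9, 2) <= release_tuple(version) < (3, 20)


def preflight():
    """Collect human-readable problems; an empty list means none were found."""
    problems = []
    try:
        installed = metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        installed = None  # nothing installed yet, so nothing to conflict with
    if installed is not None and not protobuf_pin_ok(installed):
        problems.append(f"protobuf {installed} is outside >=3.9.2,<3.20")
    if shutil.which("ffplay") is None:
        problems.append("ffplay not found on PATH; AudioSignal.play() will not work")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```

Running the script in the target environment prints one warning per detected problem and nothing otherwise. A real project would more likely rely on `pip check` or the `packaging` library for version comparison; the hand-rolled tuple comparison here only keeps the sketch dependency-free.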
Install
pip install descript-audiotools
Imports
- AudioSignal
wrong: import audiotools.AudioSignal (AudioSignal is a class, not a submodule, so this import fails)
correct: from audiotools import AudioSignal
- audiotools
import audiotools
Quickstart
import audiotools
import numpy as np
# Create a dummy AudioSignal (e.g., a sine wave)
sample_rate = 44100
duration = 3 # seconds
frequency = 440 # Hz
t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
sine_wave = 0.5 * np.sin(2 * np.pi * frequency * t)
# AudioSignal accepts a NumPy array (or a torch.Tensor) directly.
# Add a leading channel axis so the array has shape (channels, samples).
signal = audiotools.AudioSignal(sine_wave[np.newaxis, :], sample_rate)
print(f"Original signal sample rate: {signal.sample_rate}")
print(f"Original signal duration: {signal.duration} seconds")
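# Apply a low-pass filter with an 8000 Hz cutoff.
# Effects such as low_pass modify the signal in place and return it,
# so clone first to keep the original signal unchanged.
filtered_signal = signal.clone().low_pass(8000)
print(f"Filtered signal duration: {filtered_signal.duration} seconds")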
# To play back audio, ffplay (part of FFmpeg) must be installed.
# signal.play()
# filtered_signal.play()