pystoi
pystoi is a Python library that computes the Short-Time Objective Intelligibility (STOI) measure, a metric that correlates highly with the subjective intelligibility of degraded speech. STOI is an intrusive measure: it requires both the clean reference signal and the degraded signal. It serves as an objective alternative to listening tests for evaluating how non-linear processing, such as noise reduction or binary masking, affects speech intelligibility. The current version is 0.4.1, and development appears to be in maintenance mode.
Warnings
- gotcha The `stoi` function requires the clean and degraded speech signals to have the exact same length. Providing inputs of different lengths will raise an `Exception`.
- gotcha pystoi does not natively support batched processing of multiple audio files. Users who need to score many files must loop over them, or can consider a fork such as `batch-pystoi` (a separate package), which provides batch computation.
- gotcha The `pystoi` library performs computations exclusively on the CPU. It does not leverage GPU acceleration, even when used within frameworks like PyTorch (e.g., via `torchmetrics` wrappers).
- gotcha A separate project, `pytorch_stoi` (distinct from `pystoi`), provides a PyTorch implementation of STOI intended for use as a loss function. This implementation is an *approximation* and may not yield numerically identical results to the original `pystoi` library, which is the reference for exact STOI calculation.
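The first two warnings can be handled with a small amount of glue code. The following is a minimal sketch, not part of pystoi: `trim_to_common_length` and `score_pairs` are hypothetical helper names, and truncating to the shorter signal assumes both signals start at the same instant (it does not align them in time).

```python
import numpy as np


def trim_to_common_length(clean, degraded):
    """Truncate both 1-D signals to the length of the shorter one.

    Assumption: the signals are already time-aligned at sample 0.
    """
    n = min(len(clean), len(degraded))
    return clean[:n], degraded[:n]


def score_pairs(pairs, fs, metric):
    """Score (clean, degraded) pairs one at a time in a plain loop.

    `metric` is any callable with the signature metric(clean, degraded, fs),
    for example `stoi` imported from pystoi.
    """
    scores = []
    for clean, degraded in pairs:
        # Equalize lengths first so the metric does not reject the pair
        clean, degraded = trim_to_common_length(
            np.asarray(clean), np.asarray(degraded)
        )
        scores.append(float(metric(clean, degraded, fs)))
    return scores
```

Passing `stoi` from `pystoi` as `metric` (for example `score_pairs(pairs, 16000, stoi)`) scores each pair sequentially. The metric is injected as an argument so the helpers can be checked without the library installed.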
Install
pip install pystoi
Imports
- stoi
from pystoi import stoi
Quickstart
import os
import numpy as np
import soundfile as sf
from pystoi import stoi
# Create dummy audio files for demonstration (random noise stands in for speech)
fs = 10000  # Sample rate in Hz
duration = 1  # Duration in seconds
clean_signal = np.random.rand(fs * duration).astype(np.float32) * 0.5  # Stand-in for clean speech
denoised_signal = clean_signal + (np.random.rand(fs * duration).astype(np.float32) - 0.5) * 0.1  # Stand-in for processed speech (clean plus low-level noise)
# Save dummy files
sf.write('clean.wav', clean_signal, fs)
sf.write('denoised.wav', denoised_signal, fs)
# Load the audio files (replace with your actual paths)
clean_audio, fs_clean = sf.read('clean.wav')
denoised_audio, fs_denoised = sf.read('denoised.wav')
# Both files must share one sample rate; pystoi resamples internally to 10 kHz when it differs
assert fs_clean == fs_denoised, "Sample rates must match"
# Compute STOI
score = stoi(clean_audio, denoised_audio, fs_clean, extended=False)
print(f"STOI score: {score:.4f}")
# Clean up dummy files
os.remove('clean.wav')
os.remove('denoised.wav')
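The `extended` argument in the `stoi` call selects between classical STOI (`False`) and the extended variant, ESTOI (`True`). The sketch below computes both for one in-memory signal pair, skipping the file round trip. It assumes pystoi is installed; `make_test_pair` and `compare_modes` are hypothetical helper names, and the random signals are stand-ins rather than real speech, so the resulting scores carry no perceptual meaning.

```python
import numpy as np


def make_test_pair(fs=10000, duration=3, noise_level=0.1, seed=0):
    """Return (clean, degraded) random stand-in signals of equal length."""
    rng = np.random.default_rng(seed)
    n = fs * duration
    clean = rng.standard_normal(n)
    # Degraded signal: the clean signal plus scaled Gaussian noise
    degraded = clean + noise_level * rng.standard_normal(n)
    return clean, degraded


def compare_modes(clean, degraded, fs):
    """Return (classical STOI, extended STOI) for one signal pair."""
    # Imported here so make_test_pair stays usable without pystoi installed
    from pystoi import stoi

    return (
        stoi(clean, degraded, fs, extended=False),
        stoi(clean, degraded, fs, extended=True),
    )
```

Calling `compare_modes(clean, degraded, 10000)` with the output of `make_test_pair()` returns two floats, one per mode.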