WARP-Q: Quality Prediction For Generative Neural Speech Codecs

1.5.2 · active · verified Thu Apr 16

WARP-Q is a Python library for predicting the quality of generative neural speech codecs. It provides pretrained models for immediate use and can also load custom models. The current version is 1.5.2. The library is under active development, and its API was significantly overhauled at v1.0.0.

Quickstart

This quickstart shows how to initialize the WARP-Q model and predict a quality score for a degraded speech file against its reference. It uses a default pretrained model and creates dummy audio files when they are missing, so it runs out of the box. When you use your own audio, make sure both files are in a compatible format (e.g., WAV) and share the same sampling rate.

import os

import warpq

# Replace 'ref_audio.wav' and 'deg_audio.wav' with the paths to your own audio files.
ref_audio_path = os.path.join(os.getcwd(), 'ref_audio.wav')  # Example path
deg_audio_path = os.path.join(os.getcwd(), 'deg_audio.wav')  # Example path

# Create dummy audio only for paths that do not exist yet, and remember which
# files were created here so that cleanup never deletes your own recordings.
created_files = []
missing = [p for p in (ref_audio_path, deg_audio_path) if not os.path.exists(p)]
if missing:
    import numpy as np
    import soundfile as sf
    sample_rate = 16000
    duration = 1  # second
    data = np.random.rand(int(sample_rate * duration)).astype(np.float32) * 0.5
    for path in missing:
        # The degraded dummy is a slightly attenuated copy of the reference signal.
        gain = 0.9 if path == deg_audio_path else 1.0
        sf.write(path, data * gain, sample_rate)
        created_files.append(path)
    print(f"Created dummy audio files: {', '.join(created_files)}")

try:
    # Initialize WARP-Q model. model_path=None uses a default pretrained model.
    model = warpq.WARPQ(model_path=None)

    # Predict the WARP-Q score
    score = model.predict(ref_audio_path, deg_audio_path)
    print(f"WARP-Q score for {os.path.basename(deg_audio_path)} (vs {os.path.basename(ref_audio_path)}): {score:.4f}")

except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure your audio files exist and are valid (e.g., .wav format, same sample rate).")
    print("Also verify PyTorch and torchaudio are correctly installed.")
finally:
    # Clean up only the dummy files created above
    for path in created_files:
        if os.path.exists(path):
            os.remove(path)
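
The quickstart requires that both files share the same sampling rate. A minimal sketch of checking this before scoring follows. It uses only Python's standard `wave` module and is not part of WARP-Q. The helper names `wav_sample_rate` and `check_same_sample_rate` are hypothetical. The `wave` module reads PCM-encoded WAV files only, so other encodings would need a different reader.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def check_same_sample_rate(ref_path, deg_path):
    """Return the shared sample rate, or raise ValueError if the files differ."""
    ref_sr = wav_sample_rate(ref_path)
    deg_sr = wav_sample_rate(deg_path)
    if ref_sr != deg_sr:
        raise ValueError(
            f"Sample rate mismatch: reference is {ref_sr} Hz, degraded is {deg_sr} Hz"
        )
    return ref_sr
```

Calling `check_same_sample_rate(ref_audio_path, deg_audio_path)` just before the prediction step would surface a mismatch as a clear `ValueError` instead of a less specific failure later on.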
