pyannote-metrics
pyannote.metrics is an open-source Python library, currently at version 4.0.0, for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems. It provides a comprehensive set of evaluation metrics and a command-line interface, and is widely used in speech processing research. Releases are regular, and major versions occasionally introduce breaking changes.
Warnings
- breaking Version 3.3.0 changed how diarization purity and coverage are computed so that overlapping regions are explicitly accounted for. Values obtained with earlier versions may therefore differ, particularly on data or system outputs that contain overlapping speech.
- gotcha Comparison of evaluation scores across different diarization evaluation tools (e.g., `pyannote.metrics` vs. `md-eval`) is not recommended due to varying design choices, default parameters (like collar size), and handling of speaker mapping and overlapping speech.
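- gotcha The `collar` parameter significantly impacts DER. Manual annotations rarely have sample-level precision, so it is common practice to exclude a short region around each reference boundary (typically 250 ms on either side) from scoring. In `pyannote.metrics`, `collar` is the total excluded duration per boundary, split half before and half after, so the 250 ms convention corresponds to `collar=0.5`, while `collar=0.25` excludes only 125 ms on each side. The default is `collar=0.0`, which strict benchmarks also use.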
- breaking Version 2.0.1 dropped support for Python 2.7 and for all evaluation file formats except RTTM. Ensure your environment uses Python 3.10+ and that input annotations are in RTTM format.
Install
pip install pyannote-metrics
Imports
- DiarizationErrorRate
from pyannote.metrics.diarization import DiarizationErrorRate
- Annotation
from pyannote.core import Annotation
- Segment
from pyannote.core import Segment
Quickstart
from pyannote.core import Segment, Annotation
from pyannote.metrics.diarization import DiarizationErrorRate
# Define a reference (ground truth) annotation
reference = Annotation(uri='file1')
reference[Segment(0, 10)] = 'A'
reference[Segment(12, 20)] = 'B'
reference[Segment(24, 27)] = 'A'
reference[Segment(30, 40)] = 'C'
# Define a hypothesis (system output) annotation
hypothesis = Annotation(uri='file1')
hypothesis[Segment(2, 13)] = 'a'
hypothesis[Segment(13, 14)] = 'd'
hypothesis[Segment(14, 20)] = 'b'
hypothesis[Segment(22, 38)] = 'c'
hypothesis[Segment(38, 40)] = 'd'
# Instantiate the Diarization Error Rate metric
metric = DiarizationErrorRate()
# Compute the DER
der_value = metric(reference, hypothesis)
print(f"Diarization Error Rate: {der_value:.3f}")