pyannote-pipeline

4.0.0 · active · verified Fri Apr 10

pyannote-pipeline is a component of the pyannote.audio open-source toolkit, providing tunable, state-of-the-art pipelines for speaker diarization. Built on the PyTorch machine learning framework, it covers tasks such as speaker segmentation, speaker embedding, and clustering. The library is actively maintained; version 4.0.0 is the current release within the broader pyannote.audio ecosystem.

Warnings

Install
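A typical installation, assuming a working Python environment (the quickstart below also requires `ffmpeg` on the system for audio decoding):

```shell
# Install pyannote.audio, which provides the Pipeline API used in the quickstart
pip install pyannote.audio
```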

Imports

Quickstart

This quickstart demonstrates how to load a pretrained speaker diarization pipeline from Hugging Face. It requires a Hugging Face access token, stored as an environment variable, and acceptance of the model's user conditions. The example then shows how to instantiate the pipeline and apply it to an audio file. Note that `ffmpeg` must be installed on your system for audio processing.
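Before running the quickstart, you may want to check that the token is actually available. A minimal sketch (the helper name `get_hf_token` is hypothetical; the environment variable name follows the quickstart below):

```python
import os

def get_hf_token(var: str = "HUGGINGFACE_ACCESS_TOKEN") -> str:
    """Read a Hugging Face access token from the environment.

    Raises RuntimeError with guidance when the variable is unset or empty.
    """
    token = os.environ.get(var, "").strip()
    if not token:
        raise RuntimeError(
            f"{var} is not set; create a token at hf.co/settings/tokens and export it."
        )
    return token
```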

import os
from pyannote.audio import Pipeline

# Ensure you have a Hugging Face access token set as an environment variable
# and have accepted user conditions for 'pyannote/speaker-diarization-community-1'
# on hf.co/pyannote/speaker-diarization-community-1
hf_token = os.environ.get('HUGGINGFACE_ACCESS_TOKEN', '')
if not hf_token:
    print("Error: HUGGINGFACE_ACCESS_TOKEN environment variable not set.")
    print("Please create a token at hf.co/settings/tokens and set it.")
    exit()

# Instantiate a pretrained speaker diarization pipeline
try:
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-community-1", 
        token=hf_token
    )
except Exception as e:
    print(f"Failed to load pipeline: {e}")
    print("Make sure your Hugging Face token is valid and you've accepted user conditions.")
    exit()

# Apply the pipeline to an audio file.
# Requires ffmpeg on the system and a real audio file at this path.
audio_file_path = "audio.wav"  # replace with your audio file

output = pipeline(audio_file_path)
print("Diarization results:")
for turn, speaker in output.speaker_diarization:
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker={speaker}")
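The printed turns can be post-processed in plain Python. A hypothetical sketch (not part of the pyannote API) that sums total speaking time per speaker from `(start, end, speaker)` tuples like those printed above:

```python
from collections import defaultdict

def speaking_time(turns):
    """Total speaking time per speaker from (start, end, speaker) tuples."""
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)

turns = [(0.0, 2.5, "SPEAKER_00"), (2.5, 4.0, "SPEAKER_01"), (4.0, 6.0, "SPEAKER_00")]
print(speaking_time(turns))  # {'SPEAKER_00': 4.5, 'SPEAKER_01': 1.5}
```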
