pyannoteAI Python SDK
The official pyannoteAI Python SDK is a client for the pyannoteAI cloud API. The platform provides state-of-the-art models for speaker diarization (who spoke when), speaker identification, and STT orchestration (speech-to-text with speaker attribution). The library is actively maintained and released frequently.
Warnings
- gotcha API keys are sensitive. Do not hardcode them in your codebase or expose them in client-side code or public repositories. It is highly recommended to use environment variables (e.g., `PYANNOTE_API_KEY`) for secure access.
- gotcha The `diarize` and `identify` methods require a direct, publicly accessible URL to your audio file. Indirect URLs (requiring authentication, CAPTCHA, or a landing page) or private cloud storage URLs will result in 'Could not load audio' errors. Ensure the URL points directly to the file itself (e.g., ends with .wav, .mp3).
- gotcha The pyannoteAI API limits both file size and audio duration. Diarization and identification jobs accept files up to 1 GiB in size and 24 hours in duration. Voiceprint jobs have tighter limits: 100 MiB and 30 seconds.
- gotcha The `pyannoteai-sdk` package talks to the pyannoteAI cloud platform, which runs premium, improved versions of the pyannote diarization models. These models are distinct from the ones shipped with the open-source `pyannote.audio` library and may offer higher accuracy and speed for specific use cases.
- gotcha Job results from the pyannoteAI API are automatically deleted after 24 hours. If you need to retain the results for longer, you must retrieve and save them to your own storage within this timeframe.
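
The URL and size constraints listed above can be checked locally before a job is submitted. The following is a minimal sketch and not part of the SDK: the helper names `looks_like_direct_audio_url` and `within_limits` are hypothetical, the extension list is an assumption, and the limits are the ones stated in the warnings. The extension check is only a heuristic, since a URL can end in `.wav` and still require authentication.

```python
from urllib.parse import urlparse

GIB = 1024 ** 3
MIB = 1024 ** 2

# (max size in bytes, max duration in seconds), as stated in the warnings above.
LIMITS = {
    "diarize": (1 * GIB, 24 * 3600),
    "identify": (1 * GIB, 24 * 3600),
    "voiceprint": (100 * MIB, 30),
}

# Assumed list of common audio extensions; adjust it to your own files.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".flac", ".ogg", ".m4a")


def looks_like_direct_audio_url(url: str) -> bool:
    """Heuristic: an http(s) URL whose path ends with an audio extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    return parsed.path.lower().endswith(AUDIO_EXTENSIONS)


def within_limits(job_type: str, size_bytes: int, duration_seconds: float) -> bool:
    """Return True if the file fits the documented limits for this job type."""
    max_bytes, max_seconds = LIMITS[job_type]
    return size_bytes <= max_bytes and duration_seconds <= max_seconds
```

A caller would typically run both checks and skip submission when either fails, which avoids waiting on a job that would end with a 'Could not load audio' error.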
Install
pip install pyannoteai-sdk
Imports
- Client
from pyannoteai.sdk import Client
Quickstart
import os
import sys

from pyannoteai.sdk import Client

# Read the API key from the environment instead of hardcoding it
api_key = os.environ.get("PYANNOTE_API_KEY")
if not api_key:
    sys.exit("Set the PYANNOTE_API_KEY environment variable to your pyannoteAI API key.")

client = Client(api_key)

# Example audio URL (replace with a direct, publicly accessible URL to your audio file)
audio_url = "https://files.pyannote.ai/samples/two_speakers.wav"

try:
    # Submit a diarization job
    job_id = client.diarize(audio_url)
    print(f"Diarization job submitted with ID: {job_id}")

    # Wait for the job to finish and retrieve its result
    # In a real application, you might use webhooks instead of waiting here
    job_result = client.retrieve(job_id)

    if job_result.get("status") == "succeeded":
        print("Diarization result:")
        for segment in job_result.get("output", {}).get("diarization", []):
            start = segment["start"]
            end = segment["end"]
            speaker = segment["speaker"]
            print(f"[ {start:.2f}s - {end:.2f}s ] Speaker {speaker}")
    else:
        # The job did not succeed: report its final status and any error details
        print(f"Job {job_id} status: {job_result.get('status')}")
        print(f"Error: {job_result.get('error', 'Unknown error')}")
except Exception as e:
    print(f"An error occurred: {e}")
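
Because job results are deleted from the platform after 24 hours, it can help to persist each result as soon as it is retrieved. The following is a minimal sketch using only the standard library. The helper `save_job_result` and the default directory name `pyannote_results` are hypothetical, and the sketch assumes the result is a JSON-serializable dictionary as in the Quickstart.

```python
import json
from pathlib import Path


def save_job_result(job_id: str, job_result: dict, directory: str = "pyannote_results") -> Path:
    """Write a job result to <directory>/<job_id>.json and return the file path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    out_path = out_dir / f"{job_id}.json"
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(job_result, f, indent=2)
    return out_path
```

In the Quickstart it could be called as `save_job_result(job_id, job_result)` once the job has finished. Keying the file name on the job ID keeps repeated runs from overwriting one another's output.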