LiveKit AssemblyAI Plugin
The LiveKit AssemblyAI plugin integrates AssemblyAI's Speech-to-Text (STT) service into LiveKit AI Agents, enabling real-time transcription for conversational AI applications. It is actively maintained, with frequent releases that often track the core `livekit-agents` library. The current version is 1.5.4.
Common errors
- ValueError: ASSEMBLYAI_API_KEY not set
  cause: The AssemblyAI STT plugin requires an API key for authentication, which is typically read from the `ASSEMBLYAI_API_KEY` environment variable.
  fix: Set the `ASSEMBLYAI_API_KEY` environment variable in your shell or in a `.env` file that your application loads. If you use a `.env` file, make sure `python-dotenv` is installed. Example: `export ASSEMBLYAI_API_KEY='your_key_here'`
- TypeError: AgentSession() got an unexpected keyword argument 'min_endpointing_delay'
  cause: In `livekit-agents` v1.5.0, turn handling parameters like `min_endpointing_delay` were deprecated and moved into a consolidated `TurnHandlingOptions` dictionary passed via the `turn_handling` argument.
  fix: Update your `AgentSession` initialization to use the `turn_handling` dictionary. For example, instead of `AgentSession(min_endpointing_delay=0.5)`, use `AgentSession(turn_handling={'endpointing': {'min_delay': 0.5}})`.
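The missing-key error above can be surfaced before the plugin is constructed. The following is a minimal sketch, not part of the plugin: `require_assemblyai_key` is a hypothetical helper name, and it only inspects a mapping such as `os.environ`.

```python
import os
from typing import Mapping, Optional


def require_assemblyai_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the AssemblyAI API key or raise a descriptive error.

    Hypothetical helper for illustration only; it is not part of
    livekit-plugins-assemblyai.
    """
    # Fall back to the process environment when no mapping is supplied.
    env = os.environ if env is None else env
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise ValueError(
            "ASSEMBLYAI_API_KEY not set; export it or add it to a loaded .env file"
        )
    return key


if __name__ == "__main__":
    try:
        require_assemblyai_key()
        print("ASSEMBLYAI_API_KEY found")
    except ValueError as exc:
        print(f"Configuration problem: {exc}")
```

Calling a check like this at startup, after any `.env` file has been loaded, reports the configuration problem at a predictable point rather than when the first transcription stream is opened.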
Warnings
- breaking: LiveKit Agents 1.5.0 deprecated several keyword arguments for turn handling, such as `min_endpointing_delay` and `allow_interruptions`. These are now consolidated into a `TurnHandlingOptions` dictionary. The old kwargs still work but emit deprecation warnings, and they will be removed in a future v2.0 release.
- gotcha: When using AssemblyAI's native turn detection (`turn_detection="stt"`) in LiveKit Agents, the `max_turn_silence` default differs from AssemblyAI's API. The LiveKit plugin defaults to `max_turn_silence=100`, while the AssemblyAI API defaults to `max_turn_silence=1000`. As a result, turns can end sooner than expected unless the value is set explicitly.
- gotcha: Preemptive generation is enabled by default in LiveKit Agents 1.5.0 and newer. It starts LLM and TTS inference before the user's turn has fully completed, which reduces latency but can change the agent's conversational flow.
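For code that still builds its session arguments with the legacy kwargs, the migration can be sketched as a small dictionary transform. This is an illustration under stated assumptions: `migrate_turn_kwargs` is a hypothetical helper, and it applies only the `min_endpointing_delay` to `turn_handling['endpointing']['min_delay']` mapping quoted in this document. Other legacy kwargs, such as `allow_interruptions`, pass through untouched because their new location is not given here.

```python
import warnings
from typing import Any, Dict


def migrate_turn_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Move the legacy min_endpointing_delay kwarg into a turn_handling dict.

    Hypothetical helper for illustration; it is not part of livekit-agents.
    """
    migrated = dict(kwargs)  # shallow copy so the caller's dict is not mutated
    if "min_endpointing_delay" in migrated:
        delay = migrated.pop("min_endpointing_delay")
        # Copy nested dicts before editing them, again to avoid mutating input.
        turn_handling = dict(migrated.get("turn_handling", {}))
        endpointing = dict(turn_handling.get("endpointing", {}))
        # A value already given in the new style takes precedence.
        endpointing.setdefault("min_delay", delay)
        turn_handling["endpointing"] = endpointing
        migrated["turn_handling"] = turn_handling
        warnings.warn(
            "min_endpointing_delay is deprecated; use turn_handling instead",
            DeprecationWarning,
            stacklevel=2,
        )
    return migrated


if __name__ == "__main__":
    print(migrate_turn_kwargs({"min_endpointing_delay": 0.5}))
```

The returned dictionary could then be unpacked into the session constructor, for example `AgentSession(**migrate_turn_kwargs(old_kwargs))`, assuming the remaining keys are ones the installed version accepts.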
Install
- pip install "livekit-agents[assemblyai]~=1.5"
- pip install livekit-plugins-assemblyai
Imports
- STT
  from livekit.plugins import assemblyai
- AgentSession
  from livekit.agents import AgentSession
Quickstart
import os

from livekit.agents import AgentSession, JobContext
from livekit.plugins import assemblyai
from livekit.agents.voice import VoiceAgent
from dotenv import load_dotenv

# Load variables such as ASSEMBLYAI_API_KEY from a local .env file, if present.
load_dotenv()


class MyAssemblyAIAgent(VoiceAgent):
    def __init__(self, ctx: JobContext):
        super().__init__(ctx)
        self.stt = assemblyai.STT(
            api_key=os.environ.get('ASSEMBLYAI_API_KEY', '')
        )

    async def start(self):
        # Example: using AssemblyAI STT directly (not in a full agent pipeline).
        # For a full agent, you'd pass self.stt to AgentSession's stt parameter.
        print("AssemblyAI STT initialized. API Key status: ", "Set" if self.stt.api_key else "Not Set")


if __name__ == "__main__":
    # In a real application, you'd run the agent via LiveKit's infrastructure.
    # This is a minimal example to show STT initialization.
    if not os.environ.get('ASSEMBLYAI_API_KEY'):
        print("Warning: ASSEMBLYAI_API_KEY environment variable is not set. STT may fail.")

    # You can instantiate the STT directly for testing or use it within a full AgentSession.
    my_stt_instance = assemblyai.STT(api_key=os.environ.get('ASSEMBLYAI_API_KEY', ''))
    print(f"AssemblyAI STT configured with API key: {'Yes' if my_stt_instance.api_key else 'No'}")