LiveKit OpenAI Plugin
The `livekit-plugins-openai` library provides an Agent Framework plugin for integrating OpenAI services, including the Realtime API, LLM, TTS, and STT capabilities. It also supports a wide range of OpenAI-compatible APIs, such as Azure OpenAI, Cerebras, Fireworks, and Ollama. It is part of the LiveKit Agents ecosystem and is designed for building real-time, multimodal AI applications. The library is actively maintained, with frequent releases; the current version is 1.5.2.
Warnings
- breaking LiveKit Agents v1.5.0 introduced 'Adaptive Interruption Handling', which is enabled by default. This feature uses an audio-based ML model to decide how the agent handles user interruptions (distinguishing barge-ins from back-channels), potentially altering interruption behavior that previously relied on simpler VAD. While not a breaking API change for `livekit-plugins-openai`, it fundamentally alters agent interaction dynamics within the `livekit-agents` framework.
- gotcha When using `openai.realtime.RealtimeModel` but intending to use a separate Text-to-Speech (TTS) provider (e.g., for custom voices), you *must* explicitly set the `modalities` parameter to `['text']`. Otherwise, the `RealtimeModel` will handle TTS internally, bypassing your custom TTS integration.
- gotcha OpenAI API keys or provider-specific keys (e.g., `AZURE_OPENAI_API_KEY`, `DEEPSEEK_API_KEY`, `COMETAPI_API_KEY`) are generally required for all plugin usage. If not passed as an argument to the constructor (e.g., `api_key=...`), the plugin attempts to read them from environment variables (e.g., `OPENAI_API_KEY`). Misconfiguration is a common source of errors.
- gotcha For OpenAI LLM integrations, LiveKit provides two API modes: the `Responses API` (accessed via `openai.responses.LLM`) and the `Chat Completions API` (accessed via `openai.LLM`). The `Responses API` is recommended for new Python projects because it supports provider tools (such as web search) and newer capabilities; the `Chat Completions API` is intended for Node.js projects or legacy Python compatibility. Choosing the wrong one can lead to missing features or unexpected behavior.
- gotcha Versions of `livekit-plugins-openai` prior to 0.10.9 used different default TTS audio output formats. Version 0.10.9 and later standardized on PCM, which may be incompatible with some local or third-party TTS services that only support formats such as MP3.
Install
-
pip install livekit-plugins-openai
Imports
- openai
from livekit.plugins import openai
- RealtimeModel
from livekit.plugins.openai.realtime import RealtimeModel
- LLM
from livekit.plugins.openai import LLM
- STT
from livekit.plugins.openai import STT
- TTS
from livekit.plugins.openai import TTS
- responses.LLM
from livekit.plugins.openai.responses import LLM
Quickstart
import os

from livekit.agents import Agent, AgentSession, JobContext
from livekit.plugins import openai

# Set your OpenAI API key as an environment variable or pass it directly:
# os.environ["OPENAI_API_KEY"] = "sk-..."

async def my_agent_entrypoint(ctx: JobContext):
    # Ensure OPENAI_API_KEY is set in your environment.
    openai_api_key = os.environ.get("OPENAI_API_KEY", "")
    if not openai_api_key:
        print("Error: OPENAI_API_KEY environment variable not set.")
        return

    # Join the LiveKit room assigned to this job.
    await ctx.connect()

    # The RealtimeModel combines STT, LLM, and TTS for low-latency voice
    # interactions. The 'voice' parameter selects the voice for speech generation.
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(
            voice="marin",  # example voice; see the OpenAI docs for options
            api_key=openai_api_key,
        )
    )

    # Start the agent loop; the session now handles real-time audio and text
    # interactions based on the configured OpenAI RealtimeModel.
    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful voice assistant."),
    )
    print("LiveKit Agent with OpenAI Realtime API started. Connect a participant to interact.")

# In a real LiveKit Agents application, `my_agent_entrypoint` is registered
# with the agent worker (e.g. via WorkerOptions) to run for each new job.