{"id":4087,"library":"livekit-plugins-openai","title":"LiveKit OpenAI Plugin","description":"The `livekit-plugins-openai` library is a LiveKit Agents plugin for integrating OpenAI services, including the Realtime API, LLM, TTS, and STT. It also works with a wide range of OpenAI-compatible APIs such as Azure OpenAI, Cerebras, Fireworks, and Ollama. It is part of the LiveKit Agents ecosystem, designed for building real-time, multimodal AI applications. The library is actively maintained with frequent releases; the current version is 1.5.2.","status":"active","version":"1.5.2","language":"en","source_language":"en","source_url":"https://github.com/livekit/agents","tags":["livekit","openai","agent","llm","speech-to-text","text-to-speech","realtime","voiceai","multimodal"],"install":[{"cmd":"pip install livekit-plugins-openai","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"This plugin is built on top of the LiveKit Agents framework.","package":"livekit-agents","optional":false},{"reason":"The official OpenAI Python client, which the plugin uses internally to call OpenAI APIs.","package":"openai","optional":false}],"imports":[{"symbol":"openai","correct":"from livekit.plugins import openai"},{"symbol":"RealtimeModel","correct":"from livekit.plugins.openai.realtime import RealtimeModel"},{"note":"Recent versions expose the `LLM` class directly under `livekit.plugins.openai`; import it from there rather than from the `llm` submodule.","wrong":"from livekit.plugins.openai.llm import LLM","symbol":"LLM","correct":"from livekit.plugins.openai import LLM"},{"symbol":"STT","correct":"from livekit.plugins.openai import STT"},{"symbol":"TTS","correct":"from livekit.plugins.openai import TTS"},{"note":"Use `openai.responses.LLM` for the newer Responses API, recommended for Python projects when using OpenAI LLMs with provider tools.","symbol":"responses.LLM","correct":"from livekit.plugins.openai.responses import LLM"}],"quickstart":{"code":"import os\nfrom 
livekit.agents import Agent, AgentSession, JobContext\nfrom livekit.plugins import openai\n\n# Set your OpenAI API key as an environment variable or pass it directly.\n# os.environ[\"OPENAI_API_KEY\"] = \"sk-...\"\n\nasync def my_agent_entrypoint(ctx: JobContext):\n    # Fail early if OPENAI_API_KEY is not set in the environment\n    openai_api_key = os.environ.get('OPENAI_API_KEY', '')\n    if not openai_api_key:\n        print(\"Error: OPENAI_API_KEY environment variable not set.\")\n        return\n\n    # Connect to the LiveKit room assigned to this job\n    await ctx.connect()\n\n    # The Realtime API is a speech-to-speech model: one low-latency connection\n    # handles speech input, reasoning, and speech output.\n    # The `voice` parameter selects the voice for speech generation.\n    session = AgentSession(\n        llm=openai.realtime.RealtimeModel(\n            voice=\"marin\",  # Example voice, see OpenAI docs for options\n            api_key=openai_api_key,\n        )\n    )\n\n    # Start the session; from here the framework drives the conversation\n    # using the configured RealtimeModel. The instructions are illustrative.\n    await session.start(\n        room=ctx.room,\n        agent=Agent(instructions=\"You are a helpful voice assistant.\"),\n    )\n    print(\"LiveKit Agent with OpenAI Realtime API started. Connect a participant to interact.\")\n\n# In a real LiveKit Agents application, `my_agent_entrypoint` would be\n# registered with the agent server to run for each new session.\n# This example demonstrates the core setup for the OpenAI plugin.","lang":"python","description":"This quickstart shows how to create and start a LiveKit `AgentSession` with `livekit-plugins-openai` for real-time voice AI. It configures the session to use OpenAI's Realtime API, a speech-to-speech model that covers the roles of speech-to-text (STT), large language model (LLM), and text-to-speech (TTS) in a single low-latency connection. Ensure `OPENAI_API_KEY` is set as an environment variable."},"warnings":[{"fix":"Review your agent's conversational flow and adjust if the new interruption handling behavior negatively impacts the user experience. 
You can configure `turn_detection` options within `RealtimeModel` or other components if needed, though adaptive handling is often preferred for more natural conversations.","message":"LiveKit Agents v1.5.0 introduced 'Adaptive Interruption Handling', which is enabled by default. It uses an audio-based ML model to distinguish barge-ins from back-channels, so interruption behavior may differ from earlier versions that relied on simpler VAD-based detection. This is not a breaking API change for `livekit-plugins-openai`, but it changes how agents built on the `livekit-agents` framework respond to interruptions.","severity":"breaking","affected_versions":"livekit-agents >=1.5.0 (which livekit-plugins-openai 1.5.x depends on)"},{"fix":"Configure `modalities=['text']` in your `openai.realtime.RealtimeModel` instance if you're providing a separate TTS engine to the `AgentSession`. Example: `llm=openai.realtime.RealtimeModel(modalities=[\"text\"], api_key=...)`","message":"If you use `openai.realtime.RealtimeModel` together with a separate Text-to-Speech (TTS) provider (e.g., for custom voices), you *must* explicitly set the `modalities` parameter to `['text']`. Otherwise, the `RealtimeModel` generates the audio itself and your custom TTS integration is bypassed.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure the relevant API key is correctly set as an environment variable (e.g., `OPENAI_API_KEY`) or explicitly passed to the constructor of `LLM`, `STT`, `TTS`, or `RealtimeModel` instances.","message":"An OpenAI API key or a provider-specific key (e.g., `AZURE_OPENAI_API_KEY`, `DEEPSEEK_API_KEY`, `COMETAPI_API_KEY`) is generally required for all plugin usage. If a key is not passed to the constructor (e.g., `api_key=...`), the plugin attempts to read it from the corresponding environment variable (e.g., `OPENAI_API_KEY`). 
Misconfiguration is a common source of errors.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For new Python projects, prefer `openai.responses.LLM` for OpenAI LLMs to leverage the latest features and provider tools. Use `openai.LLM` for compatibility with existing code or for Node.js agents.","message":"For OpenAI LLM integrations, LiveKit provides two API modes: the Responses API (accessed via `openai.responses.LLM`) and the Chat Completions API (accessed via `openai.LLM`). The Responses API is recommended for new Python projects because it supports provider tools (like web search) and newer capabilities, while the Chat Completions API is intended for Node.js agents and for compatibility with existing Python code. Choosing the wrong one can lead to missing features or unexpected behavior.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If you run into TTS audio format incompatibility, check whether your TTS provider supports PCM, or whether `livekit-plugins-openai` (or the underlying OpenAI API) offers a configuration option for `output_format`. If not, you may need to convert the audio after generation or use a different TTS provider.","message":"Versions of `livekit-plugins-openai` prior to 0.10.9 used different default audio output formats for TTS. Version 0.10.9 and later standardized on PCM, which can be incompatible with local or third-party TTS services that only support formats such as MP3.","severity":"gotcha","affected_versions":">=0.10.9"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}