{"id":5299,"library":"livekit-plugins-cartesia","title":"LiveKit Agents Plugin for Cartesia","description":"livekit-plugins-cartesia is a Python plugin for LiveKit Agents that provides Text-to-Speech (TTS) and Speech-to-Text (STT) through Cartesia's AI services. It connects directly to Cartesia's API with your own API key, as an alternative to LiveKit Inference, so that billing goes through your Cartesia account and custom Cartesia voices can be used. It is intended for real-time, conversational AI applications that rely on Cartesia for voice synthesis and transcription.","status":"active","version":"1.5.2","language":"en","source_language":"en","source_url":"https://github.com/livekit/agents","tags":["ai","audio","cartesia","livekit","realtime","voice","tts","stt","plugin"],"install":[{"cmd":"pip install livekit-plugins-cartesia","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"This package is a plugin for the LiveKit Agents framework.","package":"livekit-agents","optional":false}],"imports":[{"note":"The Cartesia plugin is imported as a module under livekit.plugins.","symbol":"cartesia","correct":"from livekit.plugins import cartesia"},{"note":"TTS is a class within the cartesia plugin module, not directly under livekit.plugins.","wrong":"from livekit.plugins import TTS","symbol":"TTS","correct":"from livekit.plugins.cartesia import TTS"},{"note":"STT is a class within the cartesia plugin module, not directly under livekit.plugins.","wrong":"from livekit.plugins import STT","symbol":"STT","correct":"from livekit.plugins.cartesia import STT"}],"quickstart":{"code":"import os\n\nfrom livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli\nfrom livekit.plugins import cartesia\n\n\nasync def entrypoint(ctx: JobContext):\n    # Connect to the room assigned to this job.\n    await ctx.connect()\n\n    # Read the key up front so a missing CARTESIA_API_KEY fails fast with a KeyError.\n    api_key = os.environ[\"CARTESIA_API_KEY\"]\n\n    session = AgentSession(\n        stt=cartesia.STT(api_key=api_key),\n        tts=cartesia.TTS(api_key=api_key),\n        # Other agent components like LLM, VAD, etc., would be configured here.\n        # Without an LLM, the agent only speaks the greeting below.\n    )\n\n    await session.start(\n        room=ctx.room,\n        agent=Agent(instructions=\"You are a helpful voice assistant.\"),\n    )\n\n    # Example: say something to the user through Cartesia TTS.\n    await session.say(\"Hello! I am a Cartesia-powered voice agent. How can I help you today?\")\n\n\nif __name__ == \"__main__\":\n    # Required environment variables:\n    #   LiveKit:  LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET\n    #   Cartesia: CARTESIA_API_KEY\n    # This quickstart illustrates plugin usage, not a full deployment.\n    # Start the worker with the `dev` subcommand for local testing.\n    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))","lang":"python","description":"This quickstart creates a LiveKit `AgentSession` that uses Cartesia for TTS and STT through the `cartesia.TTS` and `cartesia.STT` classes. It requires `CARTESIA_API_KEY` to be set as an environment variable. A full LiveKit Agent setup also requires `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET`."},"warnings":[{"fix":"Refactor `AgentSession` initialization to use the `turn_handling` dictionary for all turn detection and interruption settings instead of the deprecated individual keyword arguments. Refer to the LiveKit Agents 1.5.0 changelog for details.","message":"LiveKit Agents 1.5.0 introduced significant changes to the `TurnHandlingOptions` API. The old keyword arguments for endpointing and interruption (e.g., `min_endpointing_delay`, `allow_interruptions`) are deprecated and will be removed in a future version. Update agents to use the new dictionary-based `turn_handling` parameter.","severity":"breaking","affected_versions":"livekit-agents>=1.5.0"},{"fix":"Ensure `os.environ['CARTESIA_API_KEY']` is set, or pass `api_key='YOUR_CARTESIA_API_KEY'` directly to `cartesia.TTS()` and `cartesia.STT()` during initialization.","message":"The `livekit-plugins-cartesia` plugin requires a Cartesia API key. The key must either be passed explicitly to the `TTS` and `STT` constructors or be set as the `CARTESIA_API_KEY` environment variable. Without it, requests to Cartesia fail to authenticate.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If specific voice characteristics are critical, always set the `model` and `voice` parameters explicitly in `cartesia.TTS()` so that behavior stays consistent across updates. For example, use `cartesia.TTS(model='sonic-2', ...)` if you intend to keep the older default.","message":"With livekit-agents 1.4.4, the default Cartesia TTS model was upgraded to 'Sonic 3'. If your application relied on an older default Cartesia model without specifying it explicitly, the voice output may change unexpectedly after upgrading LiveKit Agents and this plugin.","severity":"gotcha","affected_versions":"livekit-agents>=1.4.4"}],"env_vars":null,"last_verified":"2026-04-13T00:00:00.000Z","next_check":"2026-07-12T00:00:00.000Z"}