LiveKit Agents Plugin for Cartesia

1.5.2 · active · verified Mon Apr 13

livekit-plugins-cartesia is a Python plugin for LiveKit Agents that provides Text-to-Speech (TTS) and Speech-to-Text (STT) through Cartesia's AI services. It lets developers connect directly to Cartesia's API with their own API key, as an alternative to LiveKit Inference, so that billing runs through their own Cartesia account and custom Cartesia voices can be used. It is intended for real-time conversational AI applications that use Cartesia's voice synthesis and transcription.
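
As a rough illustration of using your own API key and a custom voice, the sketch below collects constructor arguments from the environment. The helper `cartesia_tts_kwargs` is hypothetical (not part of the plugin), and the keyword names `api_key`, `voice`, and `model` are assumed to match the `cartesia.TTS` constructor of the installed plugin version.

```python
import os
from typing import Mapping, Optional


def cartesia_tts_kwargs(
    env: Mapping[str, str] = os.environ,
    voice_id: Optional[str] = None,
    model: Optional[str] = None,
) -> dict:
    """Build keyword arguments for cartesia.TTS (hypothetical helper).

    Raises RuntimeError when CARTESIA_API_KEY is missing or empty, so a
    misconfigured worker fails at startup rather than on the first request.
    """
    api_key = env.get("CARTESIA_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("CARTESIA_API_KEY is not set")

    kwargs = {"api_key": api_key}
    # Only pass overrides that were given, so the plugin defaults still apply.
    if voice_id:
        kwargs["voice"] = voice_id  # assumed keyword name
    if model:
        kwargs["model"] = model  # assumed keyword name
    return kwargs
```

The result could then be unpacked into the constructor, for example `cartesia.TTS(**cartesia_tts_kwargs(voice_id="your-voice-id"))`, where `your-voice-id` is a placeholder for a voice from your own Cartesia account.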

Warnings

Install

pip install livekit-plugins-cartesia

Imports

from livekit.plugins import cartesia

Quickstart

This quickstart shows how to set up a LiveKit `AgentSession` that uses Cartesia for both TTS and STT through the `cartesia.TTS` and `cartesia.STT` classes. Both require `CARTESIA_API_KEY` to be set as an environment variable. A full LiveKit Agent setup also requires `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET`.

import os

from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import cartesia


async def entrypoint(ctx: JobContext):
    # Connect the worker to the room assigned to this job.
    await ctx.connect()

    # Read the key up front so a missing variable fails with a clear KeyError.
    api_key = os.environ["CARTESIA_API_KEY"]

    session = AgentSession(
        tts=cartesia.TTS(api_key=api_key),
        stt=cartesia.STT(api_key=api_key),
        # Other agent components like LLM, VAD, etc., would be configured here
    )

    await session.start(
        room=ctx.room,
        agent=Agent(instructions="You are a helpful voice assistant."),
    )
    print("Agent started. Listening for speech...")

    # Example: say something to the user
    await session.say("Hello! I am a Cartesia-powered voice agent. How can I help you today?")

    # A real agent would also configure an LLM so the session can respond
    # to transcribed user speech (STT) with generated replies (TTS).


if __name__ == "__main__":
    # Required environment variables:
    #   LiveKit:  LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET
    #   Cartesia: CARTESIA_API_KEY
    #
    # Run in development mode with `python agent.py dev`
    # (agent.py is a placeholder for the name of this file).
    # This quickstart is meant to illustrate the plugin usage, not a full deployment.
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
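
Because the quickstart depends on four environment variables, a preflight check can report all missing ones at once before the worker starts. The following is a minimal sketch using only the standard library; `REQUIRED_VARS`, `missing_env_vars`, and `require_env` are hypothetical names introduced here, not part of LiveKit or the Cartesia plugin.

```python
import os
from typing import Iterable, List, Mapping

# Variables named in the quickstart text above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "CARTESIA_API_KEY",
)


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


def require_env(
    required: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> None:
    """Exit with a readable message if any required variable is missing."""
    missing = missing_env_vars(required, env)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` as the first statement under `if __name__ == "__main__":` would stop the process with a single message listing every missing variable, which tends to be easier to debug than placeholder credentials that fail later during authentication.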
