LiveKit Deepgram Plugin

1.5.2 · active · verified Sat Apr 11

livekit-plugins-deepgram is a plugin for the LiveKit Agents framework that integrates Deepgram's Speech-to-Text (STT) and Text-to-Speech (TTS) services into LiveKit agents. The current version is 1.5.2. The plugin is part of the actively developed LiveKit Agents ecosystem, which is updated frequently.

Warnings

Install

pip install livekit-plugins-deepgram

Imports

from livekit.plugins import deepgram

Quickstart

This quickstart shows how to initialize the Deepgram STT and TTS plugins for use in a LiveKit agent. It uses the `deepgram.STT` and `deepgram.TTS` classes alongside a VAD and an LLM, which are typically wired together in an `AgentSession` (`VoiceAssistant` in older releases). Both Deepgram classes need an API key: set the `DEEPGRAM_API_KEY` environment variable before running, or pass the key directly.

import asyncio
import os

import aiohttp
from livekit.agents import AgentSession
from livekit.plugins import deepgram, openai, silero

# The plugins read their API keys from the environment (or accept api_key=...).
# Fail early instead of falling back to a placeholder key that would be rejected.
for name in ("DEEPGRAM_API_KEY", "OPENAI_API_KEY"):
    if not os.environ.get(name):
        raise SystemExit(f"{name} is not set")


async def main():
    # Outside a LiveKit job there is no shared HTTP session, so create one and
    # pass it to the Deepgram plugins (assumes the 1.x http_session argument).
    async with aiohttp.ClientSession() as http_session:
        # Deepgram for STT and TTS, OpenAI for the LLM, Silero for VAD
        deepgram_stt = deepgram.STT(model="nova-2", http_session=http_session)
        deepgram_tts = deepgram.TTS(model="aura-2-asteria-en", http_session=http_session)
        openai_llm = openai.LLM(model="gpt-4o-mini")
        silero_vad = silero.VAD.load()

        # AgentSession replaces the older VoiceAssistant class in livekit-agents 1.x.
        session = AgentSession(
            stt=deepgram_stt,
            tts=deepgram_tts,
            llm=openai_llm,
            vad=silero_vad,
        )
        print("AgentSession initialized with Deepgram STT and TTS.")
        # In a real agent, start the session from a job entrypoint, for example:
        #   await session.start(room=ctx.room, agent=Agent(instructions="..."))

        # Synthesize speech directly; each event carries an rtc.AudioFrame.
        async for audio in deepgram_tts.synthesize("Hello from LiveKit and Deepgram!"):
            frame = audio.frame
            print(f"Received audio frame: {frame.samples_per_channel} samples at {frame.sample_rate} Hz")


if __name__ == "__main__":
    asyncio.run(main())
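
The synthesis loop in the quickstart only reports the size of each audio chunk. To listen to the result, the chunks can be collected and written out as a WAV file. The following is a minimal sketch using only the Python standard library. It assumes each chunk can be converted to raw 16-bit signed little-endian PCM bytes and that the sample rate and channel count are known; `pcm_chunks_to_wav` and `duration_seconds` are hypothetical helper names, not part of the plugin.

```python
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(
    chunks: Iterable[bytes], sample_rate: int, num_channels: int = 1
) -> bytes:
    """Pack raw 16-bit PCM chunks into an in-memory WAV file (hypothetical helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(num_channels)
        wav.setsampwidth(2)  # 2 bytes per sample: assumes 16-bit PCM
        wav.setframerate(sample_rate)
        for chunk in chunks:
            wav.writeframes(chunk)
    return buffer.getvalue()


def duration_seconds(wav_bytes: bytes) -> float:
    """Return the playback duration of a WAV file held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


if __name__ == "__main__":
    # Two chunks of silence stand in for synthesized audio.
    silence = b"\x00\x00" * 2400
    wav_bytes = pcm_chunks_to_wav([silence, silence], sample_rate=24000)
    print(f"{len(wav_bytes)} bytes, {duration_seconds(wav_bytes):.2f} s")
```

In the quickstart, one way to use this would be to append the PCM bytes of each chunk to a list inside the loop, then pass that list together with the sample rate reported by the audio frames. Check the frame type in your installed version to confirm the sample format before relying on the 16-bit assumption.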
