{"id":9894,"library":"livekit-plugins-aws","title":"LiveKit Agents AWS Plugin","description":"livekit-plugins-aws is a LiveKit Agents plugin that integrates Amazon Web Services into real-time voice applications. It exposes Amazon Polly for Text-to-Speech (TTS) and Amazon Transcribe for Speech-to-Text (STT) within the LiveKit Agents framework, so developers can build voice AI agents on AWS backend services. The library is currently at version 1.5.4 and is part of the actively developed `livekit/agents` monorepo.","status":"active","version":"1.5.4","language":"en","source_language":"en","source_url":"https://github.com/livekit/agents","tags":["livekit","aws","stt","tts","ai","realtime","voice","transcribe","polly"],"install":[{"cmd":"pip install livekit-plugins-aws","lang":"bash","label":"Install livekit-plugins-aws"},{"cmd":"pip install \"livekit-agents[aws,openai,silero]\"","lang":"bash","label":"Install LiveKit Agents with AWS, OpenAI, and Silero extras"}],"dependencies":[{"reason":"Core library for building LiveKit agents, required for the plugin to function.","package":"livekit-agents","optional":false},{"reason":"AWS SDK for Python, used for interacting with AWS services like Polly and Transcribe.","package":"boto3","optional":false},{"reason":"Commonly used for the LLM component in LiveKit Agents, often installed via `livekit-agents[openai]`.","package":"openai","optional":true}],"imports":[{"symbol":"STT","correct":"from livekit.plugins import aws\naws_stt = aws.STT()"},{"symbol":"TTS","correct":"from livekit.plugins import aws\naws_tts = aws.TTS()"},{"note":"The AWS plugin does not appear to ship its own VAD; a dedicated VAD plugin such as Silero (`livekit-plugins-silero`) is typically used instead.","symbol":"VAD","correct":"from livekit.plugins import silero\nvad = silero.VAD.load()"}],"quickstart":{"code":"from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli\nfrom livekit.plugins import aws, openai, silero\n\n# --- Environment Variables Needed ---\n# LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET\n# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION (e.g., us-east-1)\n# OPENAI_API_KEY (if using OpenAI LLM)\n# ----------------------------------\n\n\nclass MyAWSVoiceAgent(Agent):\n    def __init__(self) -> None:\n        super().__init__(instructions=\"You are a helpful voice assistant.\")\n\n\nasync def entrypoint(ctx: JobContext) -> None:\n    await ctx.connect()\n    print(f\"Agent connected to room: {ctx.room.name}\")\n\n    # AWS STT and TTS pick up credentials and region from\n    # environment variables or default AWS config (~/.aws/credentials).\n    session = AgentSession(\n        stt=aws.STT(),  # Uses Amazon Transcribe\n        llm=openai.LLM(),  # Example: using OpenAI for the LLM\n        tts=aws.TTS(),  # Uses Amazon Polly\n        vad=silero.VAD.load(),  # Assumption: Silero VAD from livekit-plugins-silero\n        # preemptive_generation=False,  # Uncomment to disable the 1.5.0 default\n    )\n\n    # The session drives the STT -> LLM -> TTS loop for every user turn.\n    await session.start(room=ctx.room, agent=MyAWSVoiceAgent())\n    print(\"Agent session started with AWS STT/TTS. Waiting for user input...\")\n\n    # Have the agent speak first (via Amazon Polly).\n    await session.generate_reply(instructions=\"Greet the user and offer your help.\")\n\n\nif __name__ == \"__main__\":\n    # LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET are read from the environment.\n    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))\n","lang":"python","description":"This quickstart sets up a basic LiveKit Agent that uses Amazon Transcribe for Speech-to-Text (STT) and Amazon Polly for Text-to-Speech (TTS). It assumes LiveKit server credentials and AWS credentials are configured via environment variables (e.g., `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`). The agent uses an OpenAI LLM (ensure `OPENAI_API_KEY` is set) to generate responses, which are then spoken back using Amazon Polly. The Silero VAD is an assumption of this example and requires `livekit-plugins-silero`. Run it with `python your_script_name.py dev` after setting the environment variables."},"warnings":[{"fix":"To revert to the previous behavior (generating only after a full user turn), initialize `AgentSession(preemptive_generation=False)`. For fine-grained control, explore `livekit.agents.PreemptiveGenerationOptions`.","message":"LiveKit Agents version 1.5.0 and above enable 'preemptive generation' by default. This changes when LLM/TTS inference starts, potentially impacting latency, cost, and conversational flow. 
This was refined further in 1.5.4.","severity":"breaking","affected_versions":"1.5.0+"},{"fix":"Ensure the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION` environment variables are set, or that your `~/.aws/credentials` file is properly configured. Verify that the IAM user/role has permissions for `polly:SynthesizeSpeech` and `transcribe:StartStreamTranscription`.","message":"AWS credentials must be correctly configured for `livekit-plugins-aws` to access services like Polly and Transcribe. Common issues include missing environment variables or incorrect IAM permissions.","severity":"gotcha","affected_versions":"All"},{"fix":"Always install `livekit-plugins-aws` directly or use the `livekit-agents[aws]` extra. Keep all `livekit` packages up to date (`pip install --upgrade livekit-agents livekit-plugins-aws`).","message":"The `livekit-plugins-aws` package is part of the `livekit/agents` monorepo and is tightly coupled to `livekit-agents` and `boto3`. Version mismatches can lead to unexpected behavior or `ModuleNotFoundError`.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Install the AWS plugin using `pip install livekit-plugins-aws` or by installing `livekit-agents` with the `aws` extra: `pip install livekit-agents[aws]`.","cause":"The `boto3` AWS SDK for Python, which `livekit-plugins-aws` depends on, is not installed in your environment.","error":"ModuleNotFoundError: No module named 'boto3'"},{"fix":"Verify your `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION` environment variables. Ensure they are correct and your IAM user/role has the necessary permissions. Also check `~/.aws/credentials` if using a profile.","cause":"Your AWS credentials (access key, secret key, or session token) are incorrect, missing, or have expired. 
This prevents authentication with AWS services.","error":"botocore.exceptions.ClientError: An error occurred (InvalidClientTokenId) when calling the GetCallerIdentity operation: The security token included in the request is invalid."},{"fix":"If this behavior is not desired, disable it by passing `preemptive_generation=False` when constructing the session. Example: `session = AgentSession(..., preemptive_generation=False)`.","cause":"LiveKit Agents versions 1.5.0+ enable 'preemptive generation' by default, meaning LLM and TTS inference may start before the user has finished speaking, increasing concurrency and potentially costs.","error":"Agent responds too quickly/prematurely, or costs for LLM/TTS are higher than expected after upgrade."}]}