LiveKit Google Cloud Plugins
livekit-plugins-google provides Speech-to-Text (STT) and Text-to-Speech (TTS) for the LiveKit Agents framework, backed by Google Cloud services. It lets agents transcribe audio and generate spoken responses with Google's speech models. The current version is 1.5.2; the plugin is versioned and released together with the `livekit-agents` monorepo, which is updated frequently.
Warnings
- gotcha Google Cloud authentication is required. The plugin relies on Google's standard Application Default Credentials flow: either set `GOOGLE_APPLICATION_CREDENTIALS` to the path of a service account key file, or run `gcloud auth application-default login`.
- breaking This plugin's version is tightly coupled with `livekit-agents`. Major `livekit-agents` updates (e.g., from 0.x to 1.x) may introduce breaking API changes that require corresponding updates to `livekit-plugins-google` and your agent code.
- gotcha The `livekit-agents` ecosystem, including its plugins, is under active development. API stability is a goal, but expect rapid iteration and small API adjustments even in minor or patch releases, especially for new features.
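The authentication warning above can be checked before starting an agent. The following is a minimal sketch using only the standard library; `find_adc_source` is a hypothetical helper, not part of the plugin. It assumes the default locations used by `gcloud auth application-default login` and does not cover other credential sources such as a metadata server on Google Cloud hosts or a relocated gcloud config directory.

```python
import os
from pathlib import Path


def find_adc_source(env=None, home=None):
    """Return a short description of where credentials would come from, or None.

    Hypothetical helper: it only looks at the two sources named in the warning.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. Explicit service account key file.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path and Path(key_path).is_file():
        return f"service account key file: {key_path}"

    # 2. File written by `gcloud auth application-default login`
    #    (default location; assumed not to have been relocated).
    if os.name == "nt":
        appdata = env.get("APPDATA")
        base = Path(appdata) if appdata else home / "AppData" / "Roaming"
        adc_file = base / "gcloud" / "application_default_credentials.json"
    else:
        adc_file = home / ".config" / "gcloud" / "application_default_credentials.json"
    if adc_file.is_file():
        return f"gcloud application-default credentials: {adc_file}"

    return None


if __name__ == "__main__":
    source = find_adc_source()
    print(source or "No Google Cloud credentials found; STT() and TTS() will likely fail.")
```

A result of `None` does not prove that authentication will fail, only that neither of the two sources named above was found.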
Install
pip install livekit-plugins-google
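Because the plugin's version is tightly coupled to `livekit-agents`, it can help to confirm after installing that both packages are present and on the same major version. This is a rough sketch using `importlib.metadata`; the helper names are hypothetical, and comparing only the major component is an assumption made for illustration, not a rule stated by the project.

```python
from importlib.metadata import PackageNotFoundError, version


def major(version_string):
    """Return the leading numeric component of a version string such as '1.5.2'."""
    return int(version_string.split(".")[0])


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def same_major(a, b):
    """True when two version strings share the same major component."""
    return major(a) == major(b)


if __name__ == "__main__":
    agents = installed_version("livekit-agents")
    plugin = installed_version("livekit-plugins-google")
    if agents is None or plugin is None:
        print(f"Not installed: livekit-agents={agents}, livekit-plugins-google={plugin}")
    elif same_major(agents, plugin):
        print(f"OK: livekit-agents {agents}, livekit-plugins-google {plugin}")
    else:
        print(f"Major version mismatch: livekit-agents {agents}, livekit-plugins-google {plugin}")
```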
Imports
- STT
from livekit.plugins.google import STT
- TTS
from livekit.plugins.google import TTS
Quickstart
import asyncio

from livekit.plugins.google import STT, TTS


async def main():
    # Google Cloud authentication is normally resolved from the environment:
    # GOOGLE_APPLICATION_CREDENTIALS, or the credentials written by
    # `gcloud auth application-default login`.
    try:
        # Instantiate Google Speech-to-Text
        google_stt = STT()
        # Print the class name, which is available on any Python object.
        print(f"Google STT initialized: {type(google_stt).__name__}")

        # Instantiate Google Text-to-Speech
        google_tts = TTS()
        print(f"Google TTS initialized: {type(google_tts).__name__}")

        # This quickstart only demonstrates initialization. In a real agent,
        # audio frames are passed to STT and audio frames are received from TTS.
        text_to_speak = "Hello, this is a test from LiveKit Google TTS."
        # In a real scenario, synthesize() would produce a stream of audio.
        # In livekit-agents 1.x it is typically iterated rather than awaited:
        # async for audio in google_tts.synthesize(text_to_speak): ...
        print(f"Would synthesize: '{text_to_speak}'")
    except Exception as e:
        print(
            "Failed to initialize Google plugins. Ensure Google Cloud "
            f"credentials are configured: {e}"
        )


if __name__ == "__main__":
    asyncio.run(main())