PipeCat AI
PipeCat AI is an open-source framework designed for building real-time voice and multimodal AI assistants. It provides a modular pipeline architecture for integrating various services like Speech-to-Text (STT), Large Language Models (LLM), Text-to-Speech (TTS), and real-time transports (e.g., Daily.co). It's currently in pre-1.0 development, with frequent updates introducing new features and services.
Warnings
- gotcha As a pre-1.0 library (version 0.0.x), PipeCat AI's API is subject to frequent and potentially breaking changes without strict semantic versioning. Always review release notes when upgrading.
- gotcha Beginning with v0.0.104, PipeCat AI introduced support for strongly-typed objects instead of plain dictionaries for updating service settings at runtime (e.g., `STTUpdateSettingsFrame`). While dictionaries might still work for some settings, the new typed objects are the recommended and future-proof approach.
- gotcha In v0.0.107, the default behavior of `SyncParallelPipeline` for output frame ordering changed. It now defaults to arrival order. If your application relies on the order in which pipelines were defined, you must explicitly set `frame_order=FrameOrder.PIPELINE`.
- gotcha PipeCat AI is an orchestration framework; it does not provide built-in LLM, TTS, or STT services. Users must bring their own cloud service API keys (e.g., OpenAI, Azure, Deepgram, Google) and configure the respective PipeCat service wrappers.
Install
-
pip install pipecat-ai
Imports
- Pipeline
from pipecat.pipeline.pipeline import Pipeline
- PipelineRunner
from pipecat.pipeline.runner import PipelineRunner
- LLMService
from pipecat.services.llm import LLMService
- TTSService
from pipecat.services.tts import TTSService
- VADService
from pipecat.services.vad import VADService
- DailyTransport
from pipecat.transports.services.daily import DailyTransport
- DailyParams
from pipecat.transports.services.daily import DailyParams
- DailyTransportOptions
from pipecat.transports.services.daily import DailyTransportOptions
- AudioFrame
from pipecat.frames.frames import AudioFrame
- TextFrame
from pipecat.frames.frames import TextFrame
Quickstart
import asyncio
import os
from pipecat.frames.frames import AudioFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.vad import VADService
from pipecat.transports.services.daily import DailyParams, DailyTransport, DailyTransportOptions
from pipecat.services.llm import LLMService
from pipecat.services.tts import TTSService
async def main():
# Make sure to set environment variables for DAILY_URL and OPENAI_API_KEY
# e.g., export DAILY_URL="https://example.daily.co/YOUR_ROOM"
# export OPENAI_API_KEY="sk-proj-..."
daily_url = os.environ.get("DAILY_URL", "")
openai_api_key = os.environ.get("OPENAI_API_KEY", "")
if not daily_url or not openai_api_key:
print("Please set DAILY_URL and OPENAI_API_KEY environment variables.")
return
# Setup your services (Daily, VAD, LLM, TTS)
transport = DailyTransport(
daily_url,
DailyTransportOptions(
lang="en",
vad_enabled=True,
mic_enabled=True,
speaker_enabled=True,
vad_service=VADService(),
),
)
llm = LLMService(
api_key=openai_api_key,
model="gpt-4o",
)
tts = TTSService(
api_key=openai_api_key,
model="tts-1",
voice="alloy",
)
# Define your pipeline: User audio -> LLM text -> TTS audio -> Bot audio
pipeline = Pipeline([
transport.input(), # User input (audio) from Daily
llm, # LLM processes user text
tts, # TTS generates audio from LLM text
transport.output(), # Bot output (audio) to Daily
])
runner = PipelineRunner()
print("Starting PipeCat AI assistant. Join the Daily room specified by DAILY_URL.")
await runner.run(pipeline)
if __name__ == "__main__":
asyncio.run(main())