Cartesia Python API Library
The official Python library for the Cartesia API. It gives any Python 3.9+ application convenient access to the Cartesia REST API and focuses on AI voice capabilities such as Text-to-Speech (TTS), Speech-to-Text (STT), and building real-time voice agents. The library includes comprehensive type definitions, offers both synchronous and asynchronous clients, and supports websockets. It is under active development with frequent updates; the current stable version is 3.0.2.
Warnings
- breaking Some older text-to-speech models and snapshots (e.g., 'sonic', 'sonic-english', certain 'sonic-2' snapshots) are scheduled to be deprecated and discontinued effective June 1, 2026. Requests that use these models after that date will fail.
- breaking When using Pro Voice Cloning (PVC) models, routing now requires dated model IDs (e.g., `sonic-3-2026-01-12`) instead of generic IDs (`sonic-3`).
- gotcha API key usage differs for client-side vs. server-side applications. For client-side contexts (e.g., web apps), use Access Tokens for enhanced security to avoid exposing your API key. For trusted server-side applications, local scripts, or notebooks, direct API key usage is acceptable.
- breaking The format for API error responses changed with `Cartesia-Version: 2026-03-01`. Newer requests return structured JSON errors, while older API versions (before 2026-03-01) return legacy error formats (e.g., HTTP Title: Message).
- gotcha The 3.x series of the Python SDK remains backwards compatible with 2.x method signatures, but adopting the newer helper functions and recommended patterns may require minor code changes.
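The two model-related warnings above can be checked before a request is sent. The following is a minimal sketch, not part of the Cartesia SDK: `check_model_id`, `RETIRING_MODELS`, and the `uses_pvc` flag are hypothetical names introduced for illustration. It assumes a dated ID ends in `-YYYY-MM-DD` (as in `sonic-3-2026-01-12`) and that discontinued models start failing on June 1, 2026 itself.

```python
import re
from datetime import date
from typing import List, Optional

# Model IDs named in the warning above. The affected 'sonic-2' snapshots are
# not enumerated there, so they are deliberately left out of this set.
RETIRING_MODELS = {"sonic", "sonic-english"}
RETIREMENT_DATE = date(2026, 6, 1)

# Assumption: a dated model ID ends in -YYYY-MM-DD, e.g. sonic-3-2026-01-12.
DATED_ID = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def check_model_id(
    model_id: str,
    uses_pvc: bool = False,
    today: Optional[date] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    today = today or date.today()
    problems = []

    if model_id in RETIRING_MODELS:
        when = RETIREMENT_DATE.isoformat()
        if today >= RETIREMENT_DATE:
            problems.append(f"{model_id!r} was discontinued on {when}")
        else:
            problems.append(f"{model_id!r} will be discontinued on {when}")

    # Pro Voice Cloning routing needs a dated ID rather than a generic one.
    if uses_pvc and not DATED_ID.search(model_id):
        problems.append(f"{model_id!r} is not a dated model ID; PVC routing needs one")

    return problems


if __name__ == "__main__":
    for problem in check_model_id("sonic-3", uses_pvc=True):
        print("warning:", problem)
```

Running the check in CI or at application start-up surfaces these issues earlier than a failed API call would. The set of retiring models should be kept in sync with Cartesia's published deprecation list.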
Install
- pip install cartesia
- pip install 'cartesia[websockets]'
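To confirm which of the two installs is active in an environment, a small standard-library check can help. This is a sketch under one assumption: that the `websockets` extra pulls in a top-level module named `websockets`. `has_module` is a hypothetical helper, not part of the Cartesia SDK.

```python
import importlib.util


def has_module(name: str = "websockets") -> bool:
    """Return True if a top-level module called `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if not has_module("cartesia"):
        print("cartesia not found; try: pip install cartesia")
    elif not has_module("websockets"):
        # Assumption: the extra installs a module named 'websockets'.
        print("websockets not found; try: pip install 'cartesia[websockets]'")
    else:
        print("cartesia and websockets are both importable")
```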
Imports
- Cartesia
from cartesia import Cartesia
Quickstart
import os
from cartesia import Cartesia

# Ensure the CARTESIA_API_KEY environment variable is set
client = Cartesia(
    api_key=os.environ.get("CARTESIA_API_KEY", "")
)

try:
    response = client.tts.generate(
        model_id="sonic-3",  # Note: specific models may have deprecation warnings
        output_format={
            "container": "wav",
            "encoding": "pcm_f32le",
            "sample_rate": 44100,
        },
        transcript="I have to say that I'd rather stay awake when I'm asleep.",
        voice={
            "mode": "id",
            "id": "e07c00bc-4134-4eae-9ea4-1a55fb45746b",  # Example voice ID
        },
    )
    # In a real application, you might stream this or handle it as bytes
    with open("cartesia_generated.wav", "wb") as f:
        for chunk in response.iter_bytes():
            f.write(chunk)
    print("Audio generated and saved to cartesia_generated.wav")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure your CARTESIA_API_KEY is set and valid.")
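The quickstart falls back to an empty string when `CARTESIA_API_KEY` is unset, so a missing key only shows up once a request fails. One possible alternative is to fail fast before the client is created. The helper below is a sketch using only the standard library; `require_api_key` is a hypothetical name, not an SDK function.

```python
import os


def require_api_key(env_var: str = "CARTESIA_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the client"
        )
    return key


# Intended use with the quickstart client (not executed here):
#   client = Cartesia(api_key=require_api_key())
```

This suits trusted server-side code, local scripts, and notebooks. Client-side contexts should use Access Tokens instead, as noted in the warnings.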