Cartesia Python API Library

3.0.2 · active · verified Tue Apr 14

The official Python library for the Cartesia API provides convenient access to its REST API from any Python 3.9+ application. It covers AI voice capabilities such as Text-to-Speech (TTS) and Speech-to-Text (STT), and supports building real-time voice agents. The library includes comprehensive type definitions, offers both synchronous and asynchronous clients, and supports WebSockets. It is under active development; the current stable version is 3.0.2.

Warnings

Install

Imports

Quickstart

This quickstart initializes the Cartesia client with an API key read from the `CARTESIA_API_KEY` environment variable, generates text-to-speech audio, and saves it to a WAV file. It uses the `client.tts.generate` method for basic synchronous audio generation.

import os
from cartesia import Cartesia

# Read the API key from CARTESIA_API_KEY. Set the variable before running:
# if it is unset, the fallback below passes an empty key to the client.
client = Cartesia(
    api_key=os.environ.get("CARTESIA_API_KEY", "")
)

try:
    response = client.tts.generate(
        model_id="sonic-3", # Note: specific models may have deprecation warnings
        output_format={
            "container": "wav",
            "encoding": "pcm_f32le",
            "sample_rate": 44100,
        },
        transcript="I have to say that I'd rather stay awake when I'm asleep.",
        voice={
            "mode": "id",
            "id": "e07c00bc-4134-4eae-9ea4-1a55fb45746b", # Example voice ID
        },
    )

    # Write the audio to disk chunk by chunk. In a real application you might
    # instead stream the chunks to a player or keep them in memory as bytes.
    with open("cartesia_generated.wav", "wb") as f:
        for chunk in response.iter_bytes():
            f.write(chunk)
    print("Audio generated and saved to cartesia_generated.wav")

except Exception as e:
    # Broad catch for brevity; this also catches network and file errors.
    print(f"An error occurred: {e}")
    print("If this is an authentication error, check that CARTESIA_API_KEY is set and valid.")
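
The quickstart falls back to an empty API key when the environment variable is unset, and it writes the audio inline. As a minimal sketch of a stricter variant, the two helpers below fail fast on a missing key and write any iterable of byte chunks to a file. `require_api_key` and `save_chunks` are hypothetical names introduced here for illustration; they are not part of the Cartesia library and use only the Python standard library.

```python
import os
from typing import Iterable, Mapping


def require_api_key(env: Mapping[str, str] = os.environ,
                    name: str = "CARTESIA_API_KEY") -> str:
    """Return the API key from `env`, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running.")
    return key


def save_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write byte chunks to `path` and return the total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            # write() on a binary file returns the number of bytes written.
            total += f.write(chunk)
    return total
```

With these in place, the quickstart could plausibly construct the client as `Cartesia(api_key=require_api_key())` and save the result with `save_chunks(response.iter_bytes(), "cartesia_generated.wav")`, assuming the response exposes `iter_bytes()` as shown above.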
