Cerebras Cloud SDK

1.67.0 · active · verified Tue Mar 24

Official Python SDK for the Cerebras Cloud inference API, providing access to ultra-fast LLM inference on Cerebras Wafer-Scale Engine hardware through an OpenAI-compatible API surface. Generated with Stainless. Current version: 1.67.0 (Mar 2026). Requires Python 3.9+.

Warnings

Not to be confused with cerebras-sdk on PyPI, which is a hardware kernel development tool and a completely different product.

Install

pip install cerebras-cloud-sdk

Imports

from cerebras.cloud.sdk import Cerebras

Quickstart

Minimal Cerebras inference call using cerebras-cloud-sdk 1.x.

# pip install cerebras-cloud-sdk
import os

from cerebras.cloud.sdk import Cerebras

client = Cerebras(
    api_key=os.environ.get('CEREBRAS_API_KEY')
)

response = client.chat.completions.create(
    model='llama3.1-8b',
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'What is fast inference?'}
    ]
)
print(response.choices[0].message.content)
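The quickstart above returns the full message in one response; the OpenAI-compatible surface also supports incremental streaming via `stream=True`. Below is a minimal offline sketch of consuming such a stream: the real network call is shown in a comment, and `fake_chunk` objects (hypothetical values, not part of the SDK) stand in for the `chunk.choices[0].delta.content` shape that streamed chunks expose.

```python
from types import SimpleNamespace

# Real streaming call (requires the SDK installed and CEREBRAS_API_KEY set):
#   stream = client.chat.completions.create(
#       model="llama3.1-8b",
#       messages=[{"role": "user", "content": "What is fast inference?"}],
#       stream=True,
#   )

# Mocked chunks mirroring the OpenAI-compatible delta shape,
# chunk.choices[0].delta.content (text values are hypothetical):
def fake_chunk(text):
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )

stream = [fake_chunk("Fast "), fake_chunk("inference."), fake_chunk(None)]

parts = []
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content is not None:  # the final chunk carries no content
        parts.append(content)
print("".join(parts))
```

Accumulating the deltas this way reconstructs the same text the non-streaming call would return in `response.choices[0].message.content`.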
