Cohere Python SDK

5.20.4 | verified Mon May 11 | auth: no | python | install: stale | quickstart: verified

Official Python SDK for the Cohere API. Two client classes coexist: Client (v1, legacy) and ClientV2 (v2, current). Most LLM-generated code uses v1 patterns. The Generate endpoint is deprecated, and model is now required in all v2 calls.

pip install cohere
error ModuleNotFoundError: No module named 'cohere'
cause The 'cohere' Python package is not installed in your current environment.
fix
pip install cohere
error cohere.core.api_error.ApiError: headers: {'content-type': 'application/json'}, status_code: 401, body: {'message': 'invalid api token'}
cause The API key provided for authentication is either incorrect, expired, or malformed.
fix
Ensure your CO_API_KEY environment variable is set with a valid, active Cohere API key, or pass the correct key directly to cohere.ClientV2(api_key='YOUR_API_KEY').
error AttributeError: 'Client' object has no attribute 'generate'
cause The code calls `generate` on a client object that does not expose it. `generate` belongs to the deprecated v1 Generate endpoint; v2 code (`cohere.ClientV2`) uses `chat` for generative tasks.
fix
Migrate to cohere.ClientV2 and use the chat endpoint for generative tasks, or other specific V2 endpoints like embed or rerank.
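import cohere

co = cohere.ClientV2()  # reads CO_API_KEY from the environment
response = co.chat(
    model="command-a-03-2025",  # model is required in v2
    messages=[{"role": "user", "content": "Hello world!"}],
)
# v2 responses nest the reply text under message.content
print(response.message.content[0].text)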
error cohere.core.api_error.ApiError: headers: {'content-type': 'application/json'}, status_code: 400, body: {'message': 'a model parameter is required for this endpoint'}
cause The v2 API requires an explicit 'model' parameter on endpoints such as `chat`, `embed`, and `rerank`. Unlike v1, there is no default model fallback.
fix
Always include the model parameter with a valid model name (e.g., 'command-r-plus', 'embed-english-v3.0') in your API calls.
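import cohere

co = cohere.ClientV2()  # reads CO_API_KEY from the environment
response = co.embed(
    model="embed-english-v3.0",  # model is required in v2
    texts=["Hello world"],
    input_type="search_document",  # use "search_query" when embedding queries
    embedding_types=["float"],  # request float vectors explicitly
)
print(response)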
error AttributeError: module 'cohere' has no attribute 'AsyncClient'
cause The `cohere` module that Python imported does not expose `AsyncClient`. This usually points to an outdated SDK install or to a local file or package named `cohere` shadowing the real one. Note that `ClientV2` itself is synchronous; async calls go through a separate async client class.
fix
Upgrade the SDK (pip install -U cohere) and check that nothing in your project is named cohere.py. Current 5.x releases provide cohere.AsyncClient (v1) and cohere.AsyncClientV2 (v2), whose methods are awaited (e.g., await co.chat(...)). If a wrapper library that expects cohere.AsyncClient (such as older versions of LangChain) still fails, pin mutually compatible versions of the wrapper and the SDK.
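A minimal sketch of an async v2 call, assuming a 5.x SDK that provides cohere.AsyncClientV2 and that CO_API_KEY is set. The helper name ask is illustrative, and the import sits inside the coroutine so the module loads even where the package is missing.

```python
async def ask(prompt: str, model: str = "command-a-03-2025") -> str:
    """Send one user message through the async v2 client and return the reply text."""
    import cohere  # imported lazily; requires `pip install cohere`

    # Assumption: like ClientV2, the async client reads CO_API_KEY from the environment.
    co = cohere.AsyncClientV2()
    response = await co.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.message.content[0].text
```

Run it from synchronous code with asyncio.run(ask("Hello")).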
breaking Generate endpoint (co.generate()) deprecated as of Aug 26, 2025. Calls will fail after sunset.
fix Migrate to co.chat() with ClientV2. See https://docs.cohere.com/v1/docs/migrating-from-cogenerate-to-cochat
breaking model is a required parameter in all v2 API calls. Omitting it raises an error. v1 had a default model fallback.
fix Always pass model= explicitly. Current recommended: command-a-03-2025
breaking preamble parameter removed in v2. System instructions must be passed as a system role message.
fix Replace preamble='...' with messages=[{'role': 'system', 'content': '...'}, ...]
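The replacement above can be sketched as a small helper that turns v1-style arguments into a v2 messages list. The function name to_v2_messages and its parameters are hypothetical, not part of the SDK.

```python
def to_v2_messages(user_message, preamble=None, history=None):
    """Build a v2 `messages` list from v1-style arguments (hypothetical helper)."""
    messages = []
    if preamble:
        # The v1 preamble becomes a leading system-role message in v2.
        messages.append({"role": "system", "content": preamble})
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_message})
    return messages


messages = to_v2_messages(
    "Summarize this ticket.",
    preamble="You are a terse assistant.",
)
```

The resulting list can be passed as messages= to co.chat() on a ClientV2 instance.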
breaking conversation_id removed in v2. Cohere no longer manages chat history server-side.
fix Manage conversation history client-side by passing full messages array each turn
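One way to manage history client-side is a small wrapper that stores every turn and resends the whole list. This is a sketch: the Conversation class and its send callable are assumptions for illustration, and the stand-in backend below replaces a real co.chat() call.

```python
class Conversation:
    """Keeps the full message history client-side and resends it every turn."""

    def __init__(self, send, system=None):
        # `send` is a hypothetical callable: list of message dicts -> reply text.
        self._send = send
        self.messages = []
        if system:
            self.messages.append({"role": "system", "content": system})

    def ask(self, text):
        self.messages.append({"role": "user", "content": text})
        reply = self._send(list(self.messages))  # pass a copy of the full history
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Stand-in backend for illustration; it only reports how much history it received.
conv = Conversation(send=lambda msgs: f"received {len(msgs)} messages", system="Be brief.")
first = conv.ask("What is reranking?")
second = conv.ask("And embeddings?")
```

With a real client, send could be lambda msgs: co.chat(model="command-a-03-2025", messages=msgs).message.content[0].text.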
breaking connectors (built-in RAG) removed in v2. Web search now requires passing a web search tool explicitly.
fix Implement web search as a tool_use pattern. See v2 migration guide.
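A rough sketch of the tool pattern, assuming the v2 function-tool schema (a "function" entry with JSON Schema parameters). The web_search function is a placeholder with no real backend, and TOOL_REGISTRY and run_tool_call are hypothetical names; how tool calls come back and how results are returned to the model is covered in the migration guide.

```python
import json

# Tool definition to pass as tools=[WEB_SEARCH_TOOL] in co.chat(...).
WEB_SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return result snippets.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
            },
            "required": ["query"],
        },
    },
}


def web_search(query):
    """Placeholder: swap in a real search backend."""
    return [{"title": "stub", "snippet": f"No backend configured for: {query}"}]


TOOL_REGISTRY = {"web_search": web_search}


def run_tool_call(name, arguments_json):
    """Execute one tool call; `arguments_json` is the JSON string of arguments."""
    handler = TOOL_REGISTRY[name]
    return handler(**json.loads(arguments_json))


results = run_tool_call("web_search", '{"query": "cohere v2 migration"}')
```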
breaking Rerank v2 models are deprecated. Use rerank v3.5.
fix Update rerank calls to pass the v3.5 model (model='rerank-v3.5'). Check SDK release notes for exact method changes.
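A minimal sketch of a rerank call pinned to the v3.5 model. It assumes co is a ClientV2 instance and that the response exposes results with index and relevance_score fields; the helper name rerank_documents is hypothetical.

```python
def rerank_documents(co, query, documents, top_n=3, model="rerank-v3.5"):
    """Return (document, score) pairs ordered as the rerank endpoint ranks them."""
    response = co.rerank(
        model=model,
        query=query,
        documents=documents,
        top_n=top_n,
    )
    # Each result refers back to the input list by position.
    return [(documents[r.index], r.relevance_score) for r in response.results]
```

Keeping the model name as a default argument leaves a single place to update when the recommended rerank model changes.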
gotcha Response text access changed between v1 and v2. v1: response.text. v2: response.message.content[0].text. Mixing clients and response patterns is the most common error.
fix Check which client you instantiated (Client vs ClientV2) and use the matching response access pattern.
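During a migration, a small shim can read the reply text from either response shape. This is a sketch only: extract_text is a hypothetical helper, and the objects below are stand-ins that mimic the two attribute layouts, not real SDK types.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return reply text from either a v1-style or v2-style chat response."""
    message = getattr(response, "message", None)
    if message is not None:
        # v2 shape: response.message.content[0].text
        return message.content[0].text
    # v1 shape: response.text
    return response.text


# Stand-in objects mimicking the two response shapes.
v1_like = SimpleNamespace(text="hi from v1")
v2_like = SimpleNamespace(
    message=SimpleNamespace(content=[SimpleNamespace(text="hi from v2")])
)
```

Once all call sites use ClientV2, the shim can be dropped in favor of direct v2 access.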
gotcha num_generations, stop_sequences, logit_bias, and truncate parameters from Generate are not supported in Chat. No direct equivalent for num_generations.
fix For num_generations=n, call co.chat() n times. Trim outputs on your side instead of stop_sequences.
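Both workarounds can be sketched as client-side helpers. The names trim_at_stop and generate_n are hypothetical, and chat_once stands for any zero-argument callable that performs one co.chat() call and returns its text.

```python
def trim_at_stop(text, stop_sequences):
    """Cut `text` at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty strings, which would match at position 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def generate_n(chat_once, n):
    """Emulate num_generations=n by calling the chat function n times."""
    return [chat_once() for _ in range(n)]


trimmed = trim_at_stop("Answer: 42\n\nQuestion: next", ["\n\n", "Question:"])
samples = generate_n(lambda: "draft", 3)
```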
gotcha ClientV2 reads the API key from the CO_API_KEY environment variable, not COHERE_API_KEY. If CO_API_KEY is unset or the key is stored under the wrong name, client instantiation or the first request fails with an authentication error.
fix Set export CO_API_KEY=your_key. ClientV2() reads this automatically.
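A fail-fast check before creating the client can make the wrong variable name obvious. This is a sketch; resolve_cohere_key is a hypothetical helper, not part of the SDK.

```python
import os


def resolve_cohere_key(env=None):
    """Return the key from CO_API_KEY, or raise with a hint about the common misnaming."""
    env = os.environ if env is None else env
    key = env.get("CO_API_KEY", "").strip()
    if key:
        return key
    if env.get("COHERE_API_KEY"):
        raise RuntimeError(
            "COHERE_API_KEY is set, but the SDK expects CO_API_KEY. Rename the variable."
        )
    raise RuntimeError("CO_API_KEY is not set.")
```

The returned value can be passed explicitly, as in cohere.ClientV2(api_key=resolve_cohere_key()).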
uv add cohere
python os / libc status wheel install import disk
3.10 alpine (musl) - - - -
3.10 slim (glibc) - - - -
3.11 alpine (musl) - - - -
3.11 slim (glibc) - - - -
3.12 alpine (musl) - - - -
3.12 slim (glibc) - - - -
3.13 alpine (musl) - - - -
3.13 slim (glibc) - - - -
3.9 alpine (musl) - - - -
3.9 slim (glibc) - - - -

Minimal chat completion using v2 client

import cohere

co = cohere.ClientV2()  # reads CO_API_KEY env var

response = co.chat(
    model='command-a-03-2025',
    messages=[{'role': 'user', 'content': 'Hello'}]
)
print(response.message.content[0].text)