Cohere Python SDK
Official Python SDK for the Cohere API. It ships two coexisting client classes: Client (v1, legacy) and ClientV2 (v2, current). Most LLM-generated code uses v1 patterns. The Generate endpoint is deprecated, and model is now required in all v2 calls.
Warnings
- breaking Generate endpoint (co.generate()) deprecated as of Aug 26, 2025. Calls will fail after sunset.
- breaking model is a required parameter in all v2 API calls. Omitting it raises an error. v1 had a default model fallback.
- breaking preamble parameter removed in v2. System instructions must be passed as a system role message.
- breaking conversation_id removed in v2. Cohere no longer manages chat history server-side.
- breaking connectors (built-in RAG) removed in v2. Web search now requires passing a web search tool explicitly.
- breaking rerank v2 deprecated. Use rerank v3.5.
- gotcha Response text access changed between v1 and v2. v1: response.text. v2: response.message.content[0].text. Mixing clients and response patterns is the most common error.
- gotcha num_generations, stop_sequences, logit_bias, and truncate parameters from Generate are not supported in Chat. No direct equivalent for num_generations.
- gotcha CO_API_KEY is the expected environment variable name. Not COHERE_API_KEY. Using the wrong var name results in silent auth failure.
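The preamble removal and message-shape change above can be sketched as a small migration helper. The function name and structure here are illustrative only, not part of the cohere SDK:

```python
# Hypothetical helper (not part of the cohere SDK) showing how a v1-style
# call with preamble= maps onto the v2 messages list.
def migrate_v1_chat_args(message, preamble=None):
    """Build a v2 `messages` list from v1's separate preamble/message args."""
    messages = []
    if preamble:
        # v2 removed `preamble`; system instructions become a system-role message
        messages.append({"role": "system", "content": preamble})
    messages.append({"role": "user", "content": message})
    return messages

msgs = migrate_v1_chat_args("Hello", preamble="You are terse.")
# msgs == [{'role': 'system', 'content': 'You are terse.'},
#          {'role': 'user', 'content': 'Hello'}]
```

The resulting list is what a v2 co.chat(model=..., messages=...) call expects.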
Install
- pip install cohere
- uv add cohere
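Because the SDK reads CO_API_KEY (not COHERE_API_KEY), a startup guard can surface the misnamed-variable mistake early. This helper is a hypothetical sketch, not part of the SDK:

```python
import os

def resolve_cohere_key():
    """Hypothetical startup guard: the cohere SDK reads CO_API_KEY, not COHERE_API_KEY."""
    key = os.environ.get("CO_API_KEY")
    if key:
        return key
    if os.environ.get("COHERE_API_KEY"):
        # Common mistake: key set under the wrong name, leading to auth failure
        raise RuntimeError("Found COHERE_API_KEY; the cohere SDK reads CO_API_KEY instead.")
    raise RuntimeError("CO_API_KEY is not set.")
```

Call this once before constructing ClientV2 to fail fast instead of hitting an auth error mid-request.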
Imports
- ClientV2
import cohere; co = cohere.ClientV2()
- response text (v2)
response.message.content[0].text
- system prompt (v2)
messages=[{'role': 'system', 'content': '...'}, {'role': 'user', 'content': '...'}]
- AsyncClientV2
import cohere; co = cohere.AsyncClientV2()
Quickstart
import cohere
co = cohere.ClientV2() # reads CO_API_KEY env var
response = co.chat(
model='command-a-03-2025',
messages=[{'role': 'user', 'content': 'Hello'}]
)
print(response.message.content[0].text)
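To make the v1 vs v2 response-access difference concrete without a network call, the two shapes can be mimicked with plain stand-in objects (SimpleNamespace here is illustrative; real responses are SDK model objects):

```python
from types import SimpleNamespace

# Stand-ins (not real SDK objects) mimicking the two response shapes.
v1_response = SimpleNamespace(text="Hello!")  # v1: response.text
v2_response = SimpleNamespace(
    message=SimpleNamespace(content=[SimpleNamespace(text="Hello!")])
)  # v2: response.message.content[0].text

assert v1_response.text == "Hello!"
assert v2_response.message.content[0].text == "Hello!"
```

Mixing the two access patterns (e.g. calling response.text on a ClientV2 result) is the most common error when porting v1 code.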