Ollama Python Library
Official Python client for Ollama — the local LLM runtime. Wraps the Ollama REST API with native Python types. Requires Ollama to be installed and running locally (or pointed at https://ollama.com for cloud models). Not an inference provider — it's a runtime client.
Warnings
- breaking pip install ollama installs only the Python client — not the Ollama runtime. If the runtime is not installed and running, all calls raise ConnectionError immediately.
- breaking Models must be explicitly pulled before use. Calling chat() with a model that hasn't been pulled raises a ResponseError with status_code 404.
- breaking Tool calling with stream=True is incomplete. When tools are provided, Ollama buffers the full response and returns it as a single block even with stream=True set — true incremental streaming does not occur.
- gotcha pip install ollama-python installs a different, unofficial third-party package with a completely different API (ModelManagementAPI, GenerateAPI classes). It is not the official client.
- gotcha The Ollama Python library defaults to http://localhost:11434. To connect to a remote Ollama instance, you must pass host explicitly via Client(host='http://<remote>:11434'). The OLLAMA_HOST env var is respected by the runtime but NOT automatically picked up by the Python client.
- gotcha The OpenAI-compatible endpoint is at http://localhost:11434/v1 — but this is a separate REST interface, not the native ollama Python library. Using openai SDK against this endpoint requires api_key='ollama' (any non-empty string) as a placeholder.
- gotcha Cloud models (e.g. gpt-oss:120b-cloud, deepseek-v3.1:671b-cloud) require ollama signin and ollama pull before use. They are routed through ollama.com and require an API key when accessed via the cloud API endpoint directly.
Install
pip install ollama
Imports
- chat

from ollama import chat
response = chat(model='gemma3', messages=[{'role': 'user', 'content': 'Hello'}])
print(response.message.content)

- Client (remote/custom host)

from ollama import Client
client = Client(host='http://localhost:11434')
response = client.chat(model='gemma3', messages=[{'role': 'user', 'content': 'Hello'}])
Quickstart
from ollama import chat
response = chat(
model='gemma3',
messages=[{'role': 'user', 'content': 'Why is the sky blue?'}]
)
print(response.message.content)