LlamaIndex Ollama LLM Integration

0.10.1 · active · verified Wed Apr 15

The `llama-index-llms-ollama` package integrates LlamaIndex with Large Language Models (LLMs) served locally by Ollama. It lets users run open-source models (such as Llama, Mistral, Gemma, and Phi-3) for completion and chat tasks inside a LlamaIndex application without relying on cloud-based LLM services. The current version is 0.10.1, released on March 20, 2026, following LlamaIndex's active and rapid release cadence for its integration packages.

Warnings

The default `request_timeout` is 30 seconds; larger or slower models may exceed this, so consider raising it (the quickstart below uses 120 seconds). Requests will fail unless the Ollama server is running and the requested model has been pulled.

Install
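The integration is distributed on PyPI under the package name given above; a minimal sketch assuming a standard pip environment:

```shell
# Install the LlamaIndex Ollama integration (pulls in llama-index-core)
pip install llama-index-llms-ollama
```

The Ollama server itself is installed separately; see the quickstart below.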

Imports

Quickstart

This quickstart demonstrates how to initialize the `Ollama` LLM and use it for both text completion and chat interactions within LlamaIndex. It assumes the Ollama server is running locally and that a model like `llama3.1` has been pulled using `ollama pull llama3.1`. The `request_timeout` is increased for robustness, and `context_window` is an optional parameter for memory management.

# First, ensure Ollama is installed and running, and pull a model:
# On your terminal:
# curl -fsSL https://ollama.com/install.sh | sh
# ollama serve
# ollama pull llama3.1

from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

# Initialize Ollama LLM. Adjust model and timeout as needed.
# Ensure the model 'llama3.1' is pulled via 'ollama pull llama3.1'
llm = Ollama(
    model="llama3.1:latest",
    request_timeout=120.0, # Increase timeout from default 30s if model is slow
    # context_window=8000 # Optionally set context window to limit memory usage
)

# Generate a completion
response_completion = llm.complete("Tell me a short story about a brave knight.")
print("\n--- Completion Response ---")
print(response_completion)

# Send a chat message
messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="What is the capital of France?")
]
response_chat = llm.chat(messages)
print("\n--- Chat Response ---")
print(response_chat.message.content)
