LlamaIndex Ollama Embeddings
The `llama-index-embeddings-ollama` library integrates LlamaIndex with embedding models served via Ollama, letting developers use local or self-hosted open-source models for text embeddings in their LlamaIndex applications. The current version is `0.9.0`; the package generally follows the release cadence of the main `llama-index-core` library and receives frequent updates.
Common errors
- ollama.exceptions.OllamaConnectionError: Failed to connect to Ollama: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embed (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x...>: Failed to establish a new connection: [Errno 61] Connection refused'))
  - cause: The Ollama server is not running or is not accessible at the default address (`http://localhost:11434`).
  - fix: Start the Ollama server by running `ollama serve` in your terminal. If it is running on a different host or port, specify it with `OllamaEmbedding(base_url='http://<host>:<port>')`.
- ollama.exceptions.OllamaError: model 'non-existent-model' not found, try `ollama pull non-existent-model`
  - cause: The embedding model specified in `model_name` (e.g., 'non-existent-model') has not been pulled to your local Ollama instance.
  - fix: Pull the required model using the Ollama CLI: `ollama pull <model_name>` (e.g., `ollama pull llama2` or `ollama pull nomic-embed-text`).
- AttributeError: module 'llama_index.llms.ollama' has no attribute 'OllamaEmbedding'
  - cause: `OllamaEmbedding` was imported from the LlamaIndex LLM (Large Language Model) module for Ollama instead of the dedicated embeddings module.
  - fix: Correct the import statement to `from llama_index.embeddings.ollama import OllamaEmbedding`.
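The connection error above is easiest to diagnose before constructing the embedding model. Below is a minimal preflight sketch using only the standard library; the helper name `ollama_reachable` is hypothetical (not part of the library), and it assumes Ollama's root URL answers HTTP requests when the server is up:

```python
import urllib.request
import urllib.error


def ollama_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at base_url; False on refusal or timeout."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


if not ollama_reachable():
    print("Ollama is not reachable; start it with `ollama serve`.")
```

Calling this once at startup turns an opaque `Max retries exceeded` traceback into an actionable message.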
Warnings
- gotcha The Ollama server must be running and accessible at the specified `base_url` (defaulting to `http://localhost:11434`). You typically start it with `ollama serve`.
- gotcha The model named by `model_name` in `OllamaEmbedding` must already be pulled and available in your local Ollama instance (e.g., `llama2`, `nomic-embed-text`).
- breaking Major version updates of `llama-index-core` (e.g., from 0.9.x to 0.10.x, or 0.10.x to 0.11.x) often introduce breaking changes that may require updating `llama-index-embeddings-ollama` to a compatible version.
- gotcha This package requires Python version `3.10` or higher, but less than `4.0`.
Install
- `pip install llama-index-embeddings-ollama`
Imports
- OllamaEmbedding
  - incorrect: `from llama_index.llms.ollama import OllamaEmbedding`
  - correct: `from llama_index.embeddings.ollama import OllamaEmbedding`
- OllamaEmbedding
  - incorrect: `from llama_index.core.embeddings.ollama import OllamaEmbedding`
  - correct: `from llama_index.embeddings.ollama import OllamaEmbedding`
Quickstart
from llama_index.embeddings.ollama import OllamaEmbedding

# Pre-requisites:
# 1. Ensure the Ollama server is running locally: `ollama serve` in your terminal.
# 2. Pull the desired embedding model: `ollama pull llama2` (or another model like `nomic-embed-text`).

model_name = "llama2"  # Make sure this model is pulled

try:
    # Initialize the Ollama embedding model.
    # model_name must refer to a model already pulled via Ollama; base_url is optional.
    embed_model = OllamaEmbedding(
        model_name=model_name,
        base_url="http://localhost:11434",
    )

    # Get an embedding for a single piece of text.
    text_to_embed = "This is an example sentence for LlamaIndex with Ollama embeddings."
    embedding_vector = embed_model.get_text_embedding(text_to_embed)

    print("Successfully generated embedding using OllamaEmbedding.")
    print(f"Embedding vector length: {len(embedding_vector)}")
    print(f"First 10 dimensions: {embedding_vector[:10]}")

    # You can also embed multiple texts in a batch.
    texts_to_embed_batch = [
        "LlamaIndex helps build LLM applications.",
        "Ollama runs large language models locally and efficiently.",
    ]
    embedding_vectors_batch = embed_model.get_text_embedding_batch(texts_to_embed_batch)

    print(f"\nGenerated {len(embedding_vectors_batch)} embeddings in batch.")
    print(f"Length of first batch embedding: {len(embedding_vectors_batch[0])}")
except Exception as e:
    print(f"An error occurred: {e}")
    if "Connection refused" in str(e) or "Failed to connect to Ollama" in str(e):
        print("Hint: Ensure the Ollama server is running (run `ollama serve`).")
    elif "model not found" in str(e) or "no such model" in str(e):
        print(f"Hint: Ensure the model '{model_name}' is pulled (run `ollama pull {model_name}`).")
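Under the hood, the integration talks to Ollama's HTTP API; the connection error earlier in this document names the `/api/embed` endpoint. The sketch below builds the raw request with only the standard library, which can help isolate whether a problem lies in Ollama or in the LlamaIndex wrapper. The payload shape (`model` plus an `input` list) follows Ollama's embed API; the request itself is only attempted, since the server may not be running:

```python
import json
import urllib.request
import urllib.error

base_url = "http://localhost:11434"
payload = {
    "model": "llama2",  # must already be pulled via `ollama pull llama2`
    "input": ["LlamaIndex helps build LLM applications."],
}

req = urllib.request.Request(
    f"{base_url}/api/embed",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

try:
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.loads(resp.read())
    # The response carries one vector per input string under "embeddings".
    print(f"Got {len(body['embeddings'])} embedding(s)")
except (urllib.error.URLError, OSError) as e:
    print(f"Ollama not reachable, skipping raw request: {e}")
```

If the raw request succeeds but `OllamaEmbedding` fails, the issue is on the LlamaIndex side (version mismatch or wrong import); if both fail identically, fix the server or model first.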