LlamaIndex Ollama Embeddings

0.9.0 · active · verified Fri Apr 17

The `llama-index-embeddings-ollama` library integrates LlamaIndex with embedding models served via Ollama, letting developers use local or self-hosted open-source models for text embeddings in their LlamaIndex applications. The current version is `0.9.0`; the package is updated frequently, generally tracking the release cadence of the main `llama-index-core` library.

Install

pip install llama-index-embeddings-ollama
Imports

from llama_index.embeddings.ollama import OllamaEmbedding
Quickstart

This quickstart demonstrates how to initialize the `OllamaEmbedding` model and generate embeddings for single texts and batches. It includes checks for common setup issues like the Ollama server not running or models not being pulled.

from llama_index.embeddings.ollama import OllamaEmbedding

# Pre-requisites:
# 1. Ensure the Ollama server is running locally: `ollama serve` in your terminal.
# 2. Pull the desired embedding model: `ollama pull llama2` (or another model like `nomic-embed-text`)

try:
    # Initialize the Ollama Embedding model
    # Specify the model_name (must be pulled via Ollama) and optionally base_url
    embed_model = OllamaEmbedding(
        model_name="llama2", # Make sure this model is pulled
        base_url="http://localhost:11434"
    )

    # Get an embedding for a piece of text
    text_to_embed = "This is an example sentence for LlamaIndex with Ollama embeddings."
    embedding_vector = embed_model.get_text_embedding(text_to_embed)

    print("Successfully generated embedding using OllamaEmbedding.")
    print(f"Embedding vector length: {len(embedding_vector)}")
    print(f"First 10 dimensions: {embedding_vector[:10]}")

    # You can also embed multiple texts in a batch
    texts_to_embed_batch = [
        "LlamaIndex helps build LLM applications.",
        "Ollama runs large language models locally and efficiently.",
    ]
    embedding_vectors_batch = embed_model.get_text_embedding_batch(texts_to_embed_batch)
    print(f"\nGenerated {len(embedding_vectors_batch)} embeddings in batch.")
    print(f"Length of first batch embedding: {len(embedding_vectors_batch[0])}")

except Exception as e:
    print(f"An error occurred: {e}")
    if "Connection refused" in str(e) or "Failed to connect to Ollama" in str(e):
        print("Hint: Ensure the Ollama server is running (run `ollama serve`).")
    elif "model not found" in str(e) or "no such model" in str(e):
        # Note: embed_model may be undefined here if initialization itself failed,
        # so reference the model name directly rather than embed_model.model_name.
        print("Hint: Ensure the model 'llama2' is pulled (run `ollama pull llama2`).")
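Once embedding vectors come back, a common next step is comparing them, typically by cosine similarity, which is what vector stores generally use for retrieval. The sketch below is dependency-free and uses small toy vectors in place of real Ollama output (real embeddings would have hundreds or thousands of dimensions):

```python
from math import sqrt

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    # Assumes both vectors have the same length and nonzero norm.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embed_model.get_text_embedding(...) output:
v1 = [1.0, 2.0, 3.0]
v2 = [1.0, 2.0, 3.0]   # identical to v1 -> similarity ~1.0
v3 = [-3.0, 0.0, 1.0]  # orthogonal to v1 -> similarity ~0.0

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```

In practice you would pass two vectors returned by `get_text_embedding` (or one query vector and one document vector) instead of the toy lists above.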
