HuggingFace Embeddings for LlamaIndex

0.7.0 · active · verified Tue Apr 14

This library integrates HuggingFace embedding models, including Sentence Transformers models, with LlamaIndex. It lets users create embeddings for documents and queries for retrieval, and supports models such as BGE, Mixedbread, Nomic, Jina, and E5. It is part of the LlamaIndex ecosystem, which maintains a regular release cadence for its integration packages.

Install
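The integration is published on PyPI as its own package; installing it also pulls in the `sentence-transformers` dependency it builds on:

```shell
pip install llama-index-embeddings-huggingface
```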

Imports
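The quickstart below relies on two imports: `Settings` from the LlamaIndex core package and `HuggingFaceEmbedding` from this integration package.

```python
# Settings: global configuration object for LlamaIndex
from llama_index.core import Settings

# HuggingFaceEmbedding: the embedding class provided by this package
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
```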

Quickstart

This example demonstrates how to initialize the `HuggingFaceEmbedding` class with a specified model (e.g., 'BAAI/bge-small-en-v1.5') and use it to either set the global embedding model for LlamaIndex or generate embeddings for a single piece of text directly.

from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Initialize the HuggingFaceEmbedding model
# Loads BAAI/bge-small-en-v1.5 with the default torch backend
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Option 1: Set as the global embedding model for LlamaIndex
Settings.embed_model = embed_model

# Option 2: Generate embeddings for text directly
text_to_embed = "Hello World! This is a test sentence."
embeddings = embed_model.get_text_embedding(text_to_embed)

print(f"Embeddings length: {len(embeddings)}")
print(f"First 5 embedding values: {embeddings[:5]}")
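Since `get_text_embedding` returns a plain Python list of floats, retrieval-style scoring can be done on the result directly. A minimal cosine-similarity sketch (numpy assumed; the vectors here are small stand-ins so no model download is needed):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in vectors; in practice these would come from
# embed_model.get_text_embedding(...) calls on a document and a query
doc_vec = [0.1, 0.3, 0.5]
query_vec = [0.2, 0.3, 0.4]

print(f"Similarity: {cosine_similarity(doc_vec, query_vec):.4f}")
```

Ranking documents by this score against a query embedding is the core retrieval step that LlamaIndex performs internally when the model is set via `Settings.embed_model`.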
