HuggingFace Embeddings for LlamaIndex
This library integrates HuggingFace embedding models, including Sentence Transformer models, with LlamaIndex. It allows users to create embeddings for documents and queries for retrieval, supporting models like BGE, Mixedbread, Nomic, Jina, and E5. The current version is 0.7.0, and it's part of the LlamaIndex ecosystem, which maintains a regular release cadence for its integration packages.
Warnings
- gotcha The `sentence-transformers` package is a required runtime dependency that is not pulled in automatically and must be installed separately. Without it, constructing a `HuggingFaceEmbedding` fails at runtime with an import error.
- deprecated Several `HuggingFaceEmbedding` parameters, such as `tokenizer_name`, `pooling`, `model`, and `tokenizer`, are marked as deprecated in recent versions. Relying on them may emit warnings now and break in a future release.
- breaking Python 3.9 is no longer supported. This library explicitly requires Python 3.10 or newer.
- gotcha For advanced features like ONNX or OpenVINO model inference, additional 'extra' installations for `sentence-transformers` are required (e.g., `pip install sentence-transformers[onnx]` or `pip install sentence-transformers[openvino]`).
- gotcha This library has a specific dependency range on `llama-index-core` (e.g., `>=0.13.0,<0.15` as of version 0.7.0). Using an incompatible `llama-index-core` version can lead to unexpected behavior or errors.
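Given the warnings above, one way to avoid version drift is to pin the integration together with a compatible `llama-index-core` range in a single install command (a sketch using the ranges stated above; adjust the pins for your release):

```shell
pip install "llama-index-embeddings-huggingface==0.7.0" \
            "llama-index-core>=0.13.0,<0.15" \
            sentence-transformers
```

Pinning all three in one resolver invocation lets pip surface conflicts at install time instead of at runtime.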
Install
- pip install llama-index-embeddings-huggingface
- pip install sentence-transformers
Imports
- HuggingFaceEmbedding
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
Quickstart
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
# Initialize the HuggingFaceEmbedding model
# Loads BAAI/bge-small-en-v1.5 with the default torch backend
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
# Option 1: Set as the global embedding model for LlamaIndex
Settings.embed_model = embed_model
# Option 2: Generate embeddings for text directly
text_to_embed = "Hello World! This is a test sentence."
embeddings = embed_model.get_text_embedding(text_to_embed)
print(f"Embeddings length: {len(embeddings)}")
print(f"First 5 embedding values: {embeddings[:5]}")