Sentence Transformers
Version 5.3.0 · verified Tue May 12 · auth: no · python install: stale · quickstart: stale
Framework for computing dense sentence/text/image embeddings using transformer models. Primary use cases: semantic search, semantic similarity, clustering, and reranking. Wraps transformers and provides SentenceTransformer (embedding), CrossEncoder (reranker), and SparseEncoder (sparse embedding) classes. 15,000+ pretrained models on HF Hub. Now officially maintained by Hugging Face (Tom Aarsen) after transfer from UKP Lab/TU Darmstadt. Package name: sentence-transformers (hyphen). Import name: sentence_transformers (underscore).
pip install sentence-transformers

Common errors
error ModuleNotFoundError: No module named 'sentence_transformers' ↓
cause The `sentence-transformers` package is not installed in the current Python environment.
fix
pip install sentence-transformers
error TypeError: 'str' object cannot be interpreted as an integer ↓
cause The `model.encode()` method received a single string where a list of strings was expected.
fix
Wrap the single input string in a list, e.g., model.encode(["your single sentence here"]).

error RuntimeError: CUDA out of memory. Tried to allocate X GiB (GPU X; X GiB total capacity; X GiB already allocated; X GiB free; X GiB reserved in total by PyTorch) ↓
cause The GPU does not have enough memory to process the current batch size or model, or other processes are consuming GPU memory.
fix
Reduce the batch_size parameter in model.encode(), use a smaller model, or explicitly move the model to CPU (model.to('cpu')).

error OSError: Can't load tokenizer for 'model-name'. If you were trying to load a tokenizer from a checkpoint saved by `save_pretrained`, make sure that 'model-name' is the path to a directory containing files saved by `save_pretrained`. ↓
cause The specified model name is incorrect, unavailable on Hugging Face Hub, or a network issue prevented its download.
fix
Verify the model name for typos, ensure a stable internet connection, or try a different, well-known model from the Hugging Face Hub.
Warnings
breaking Python 3.10+ required as of sentence-transformers 5.0. Python 3.9 and below will fail to install. ↓
fix Upgrade to Python 3.10+. For Python 3.9, pin: pip install "sentence-transformers<5" (quote the specifier so the shell does not treat < as a redirect).
breaking sentence-transformers 5.2.2+ dropped the requests dependency in favor of optional httpx, aligning with transformers v5. Code that relied on requests being transitively installed via sentence-transformers may see ImportError on requests. ↓
fix If your code uses requests directly, add it explicitly: pip install requests.
breaking Training with sentence-transformers 5.x requires pinning to a compatible transformers version. sentence-transformers 5.2.3 introduced a compatibility fix for transformers v5.2 Trainer changes. Older sentence-transformers 5.x with transformers v5.2 causes training failures at the logging step. ↓
fix Upgrade to sentence-transformers>=5.2.3 when using transformers>=5.2.
gotcha encode() returns numpy float32 arrays by default, not torch tensors. Passing embeddings directly to PyTorch operations without converting first causes TypeError. Many tutorials omit this. ↓
fix Use convert_to_tensor=True to get torch.Tensor, or call torch.tensor(embeddings) manually.
gotcha CrossEncoder and SentenceTransformer are architecturally different and not interchangeable. CrossEncoder scores (query, doc) pairs — it cannot encode a corpus of documents independently. Using CrossEncoder where SentenceTransformer is needed produces wrong results with no error. ↓
fix Use SentenceTransformer for bi-encoder embedding (fast, scalable). Use CrossEncoder for reranking a small candidate set (slow, higher accuracy).
gotcha util.cos_sim() returns values in [-1, 1]. It does NOT return [0, 1]. Thresholding at 0.5 as a "similarity cutoff" is a common mistake — the actual meaningful threshold depends on the model and task. ↓
fix Calibrate thresholds empirically for your specific model and domain. For all-MiniLM-L6-v2, 0.3+ is often a reasonable rough threshold for semantic similarity.
gotcha Package name is sentence-transformers (hyphen) but import name is sentence_transformers (underscore). import sentence-transformers raises SyntaxError. from sentence-transformers import ... also fails. ↓
fix pip install sentence-transformers (hyphen). from sentence_transformers import SentenceTransformer (underscore).
breaking Installation of sentence-transformers fails on Alpine Linux because its core dependency, PyTorch (torch), does not publish official binary wheels for musl libc distributions. Wheels for the newest Python versions can also lag behind the Python release itself. ↓
fix Use a glibc-based Linux distribution (like Debian or Ubuntu) if pre-built PyTorch wheels are required, or attempt to build PyTorch from source (a complex and time-consuming process). If targeting a brand-new Python version, fall back to one with readily available wheels (e.g., Python 3.10-3.12).
Install
pip install sentence-transformers[train]
pip install sentence-transformers[onnx]
pip install sentence-transformers[onnx-gpu]
pip install sentence-transformers[openvino]

Install compatibility stale last tested: 2026-05-12
python os / libc variant status wheel install import disk
3.10 alpine (musl) sentence-transformers - - - -
3.10 alpine (musl) onnx-gpu - - - -
3.10 alpine (musl) onnx - - - -
3.10 alpine (musl) openvino - - - -
3.10 alpine (musl) train - - - -
3.10 slim (glibc) sentence-transformers - - 18.51s 4.9G
3.10 slim (glibc) onnx-gpu - - - -
3.10 slim (glibc) onnx - - - -
3.10 slim (glibc) openvino - - - -
3.10 slim (glibc) train - - - -
3.11 alpine (musl) sentence-transformers - - - -
3.11 alpine (musl) onnx-gpu - - - -
3.11 alpine (musl) onnx - - - -
3.11 alpine (musl) openvino - - - -
3.11 alpine (musl) train - - - -
3.11 slim (glibc) sentence-transformers - - 23.47s 5.0G
3.11 slim (glibc) onnx-gpu - - - -
3.11 slim (glibc) onnx - - 23.53s 5.2G
3.11 slim (glibc) openvino - - 24.28s 5.3G
3.11 slim (glibc) train - - - -
3.12 alpine (musl) sentence-transformers - - - -
3.12 alpine (musl) onnx-gpu - - - -
3.12 alpine (musl) onnx - - - -
3.12 alpine (musl) openvino - - - -
3.12 alpine (musl) train - - - -
3.12 slim (glibc) sentence-transformers - - 24.68s 5.0G
3.12 slim (glibc) onnx-gpu - - 24.66s 5.5G
3.12 slim (glibc) onnx - - 24.27s 5.2G
3.12 slim (glibc) openvino - - 24.54s 5.3G
3.12 slim (glibc) train - - - -
3.13 alpine (musl) sentence-transformers - - - -
3.13 alpine (musl) onnx-gpu - - - -
3.13 alpine (musl) onnx - - - -
3.13 alpine (musl) openvino - - - -
3.13 alpine (musl) train - - - -
3.13 slim (glibc) sentence-transformers - - 21.36s 5.0G
3.13 slim (glibc) onnx-gpu - - 21.06s 5.5G
3.13 slim (glibc) onnx - - 20.75s 5.2G
3.13 slim (glibc) openvino - - 20.81s 5.3G
3.13 slim (glibc) train - - 25.69s 5.3G
3.9 alpine (musl) sentence-transformers - - - -
3.9 alpine (musl) onnx-gpu - - - -
3.9 alpine (musl) onnx - - - -
3.9 alpine (musl) openvino - - - -
3.9 alpine (musl) train - - - -
3.9 slim (glibc) sentence-transformers - - - -
3.9 slim (glibc) onnx-gpu - - - -
3.9 slim (glibc) onnx - - - -
3.9 slim (glibc) openvino - - - -
3.9 slim (glibc) train - - - -
Imports
- SentenceTransformer
  wrong: import sentence-transformers
  correct: from sentence_transformers import SentenceTransformer
- CrossEncoder
  from sentence_transformers import CrossEncoder
- util.cos_sim
  from sentence_transformers import util
Quickstart stale last tested: 2026-05-12
from sentence_transformers import SentenceTransformer, util
# Load model (downloads on first use, ~90MB for MiniLM)
model = SentenceTransformer("all-MiniLM-L6-v2")
# Encode sentences → numpy float32 arrays by default
sentences = [
"The cat sat on the mat.",
"A feline rested on a rug.",
"The stock market crashed today."
]
embeddings = model.encode(sentences) # shape: (3, 384)
print(embeddings.shape)
# Cosine similarity
cosine_scores = util.cos_sim(embeddings[0], embeddings[1:])
print(cosine_scores) # similar pair scores higher
# Return torch tensors instead of numpy
embeddings_tensor = model.encode(sentences, convert_to_tensor=True)
# Semantic search
query_embedding = model.encode("Where did the cat sleep?", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, embeddings_tensor, top_k=2)
print(hits)