txtai

9.7.0 · verified Tue May 12 · auth: no · python install: stale · quickstart: stale

All-in-one AI framework: embeddings database, semantic search, LLM orchestration, RAG, pipelines and agents. Current version: 9.7.0 (Mar 2026). TWO packages on PyPI: 'txtai' (full local library) and 'txtai.py' (thin API client for remote txtai server). Most tutorials use the full 'txtai' package. Core API: Embeddings class. index() rebuilds entire index. upsert() adds/updates without full rebuild. Content storage must be enabled for SQL queries and content retrieval.

pip install txtai
error ModuleNotFoundError: No module named 'txtai'
cause The `txtai` library has not been installed in the current Python environment or the environment is not active.
fix
pip install txtai
error ValueError: content must be enabled to save content
cause The `Embeddings` index was initialized without enabling content storage, preventing content retrieval or SQL queries.
fix
Initialize Embeddings with content=True, e.g., Embeddings(config={'content': True}).
error OSError: Can't load tokenizer for 'sentence-transformers/all-MiniLM-L6-v2'. If you were trying to load it from 'https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2', make sure you don't have a local directory with the same name.
cause The specified `sentence-transformers` model cannot be loaded, possibly due to a network issue, a typo in the model name, insufficient disk space, or a corrupted local cache.
fix
Verify the model name and internet connectivity, ensure sufficient disk space, or clear the Hugging Face cache (usually ~/.cache/huggingface/hub) if a corrupted download is suspected.
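If a corrupted download is suspected, a sketch of clearing just that model from the default cache location (adjust the path if HF_HOME or HF_HUB_CACHE is set; the model re-downloads on next use):

```shell
# Remove the cached copy of this one model rather than the whole cache
rm -rf ~/.cache/huggingface/hub/models--sentence-transformers--all-MiniLM-L6-v2
```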
error AttributeError: 'txtai.embeddings.Embeddings' object has no attribute 'add'
cause The `Embeddings` object in `txtai` does not have an `add` method; data is added using `index` or `upsert`.
fix
Use embeddings.index(data) to rebuild the index or embeddings.upsert(data) to add/update existing data.
error TypeError: 'str' object is not iterable
cause The `embeddings.index()` or `embeddings.upsert()` method expects the `data` argument to be a list of items, but a single string (or other non-iterable object) was provided.
fix
Wrap the input data in a list, even if it's a single item, e.g., embeddings.index(["text_item"]) or embeddings.upsert([("id1", "text_item", None)]).
breaking Two packages on PyPI: 'txtai' (full library) and 'txtai.py' (thin API client). They have different APIs. 'pip install txtai.py' installs a client that connects to a remote txtai server — not the local library.
fix For local use: pip install txtai. For connecting to a remote txtai API server: pip install txtai.py
breaking index() replaces the entire index. Calling it twice means the first index is gone. LLMs commonly generate code that calls index() multiple times to 'add' documents.
fix Use upsert() to add/update documents without full rebuild. Use index() only for initial load or full re-index.
breaking search() returns (id, score) tuples by default, not document text. Accessing result['text'] on a tuple raises TypeError (tuple indices must be integers) unless content=True is enabled.
fix Enable Embeddings(content=True) to store and retrieve text from search results.
gotcha SQL queries and content retrieval require content=True at index creation time. Cannot be enabled after index is built without re-indexing.
fix Always set content=True if you need SQL queries, text retrieval, or metadata filtering.
gotcha Base 'pip install txtai' has minimal deps. Most useful features (pipelines, LLM, API server) require extras: txtai[pipeline-text], txtai[api], txtai[all].
fix For RAG/LLM workflows: pip install txtai[all]. For just semantic search: pip install txtai[similarity].
gotcha Default model downloads from Hugging Face Hub on first use — requires internet access and ~100MB download. Fails in air-gapped environments.
fix Pre-download model: embeddings = Embeddings(path='/local/model/path'). Or set HF_HUB_OFFLINE=1 with a cached model.
gotcha Agents (added in v8) are built on the smolagents framework and require pip install txtai[agent]. Earlier versions used transformers agents, which had a different API.
fix pip install txtai[agent] for agent support.
breaking Installation of txtai (and its dependencies that require compilation, such as scikit-learn, hnswlib, annoy, fasttext) may fail in minimal environments (e.g., Alpine Linux) due to missing build tools. These packages often require a C/C++ compiler to build native extensions.
fix Install necessary build tools in the environment before attempting to install txtai. For Alpine Linux, this typically involves `apk add build-base python3-dev`.
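A hypothetical Alpine-based Dockerfile sketch (base image tag and package names are the usual Alpine choices, not taken from txtai's docs):

```dockerfile
FROM python:3.11-alpine
# Build toolchain must be present before pip can compile native extensions
RUN apk add --no-cache build-base python3-dev
RUN pip install txtai
```

Note that even with build tools installed, compiling heavy dependencies on musl-based images is slow and fragile; a glibc-based image such as python:3.11-slim avoids most source builds by using prebuilt wheels.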
pip install txtai[all]
pip install txtai[api]
pip install txtai.py
python os / libc variant status wheel install import disk
3.9 alpine (musl) txtai - - - -
3.9 alpine (musl) txtai.py - - - -
3.9 alpine (musl) all - - - -
3.9 alpine (musl) api - - - -
3.9 slim (glibc) txtai - - - -
3.9 slim (glibc) txtai.py - - - -
3.9 slim (glibc) all - - - -
3.9 slim (glibc) api - - - -
3.10 alpine (musl) txtai - - - -
3.10 alpine (musl) txtai.py - - - -
3.10 alpine (musl) all - - - -
3.10 alpine (musl) api - - - -
3.10 slim (glibc) txtai - - 20.05s 4.8G
3.10 slim (glibc) txtai.py - - - -
3.10 slim (glibc) all - - - -
3.10 slim (glibc) api - - - -
3.11 alpine (musl) txtai - - - -
3.11 alpine (musl) txtai.py - - - -
3.11 alpine (musl) all - - - -
3.11 alpine (musl) api - - - -
3.11 slim (glibc) txtai - - 22.76s 4.9G
3.11 slim (glibc) txtai.py - - - -
3.11 slim (glibc) all - - - -
3.11 slim (glibc) api - - 22.84s 5.0G
3.12 alpine (musl) txtai - - - -
3.12 alpine (musl) txtai.py - - - -
3.12 alpine (musl) all - - - -
3.12 alpine (musl) api - - - -
3.12 slim (glibc) txtai - - 24.01s 4.9G
3.12 slim (glibc) txtai.py - - - -
3.12 slim (glibc) all - - - -
3.12 slim (glibc) api - - 25.87s 5.0G
3.13 alpine (musl) txtai - - - -
3.13 alpine (musl) txtai.py - - - -
3.13 alpine (musl) all - - - -
3.13 alpine (musl) api - - - -
3.13 slim (glibc) txtai - - 21.43s 4.9G
3.13 slim (glibc) txtai.py - - - -
3.13 slim (glibc) all - - - -
3.13 slim (glibc) api - - 23.23s 5.0G

txtai Embeddings with content storage, search, upsert, save/load.

# pip install txtai
from txtai import Embeddings

# Create embeddings with content storage
embeddings = Embeddings(
    path='sentence-transformers/all-MiniLM-L6-v2',
    content=True
)

# Index documents
embeddings.index([
    {'id': 0, 'text': 'Python is a programming language created by Guido'},
    {'id': 1, 'text': 'JavaScript is used for web development'},
    {'id': 2, 'text': 'Rust provides memory safety without garbage collection'},
    {'id': 3, 'text': 'Go is designed for cloud infrastructure'},
])

# Semantic search — returns dicts with text
results = embeddings.search('systems programming language', 2)
for r in results:
    print(r['text'], r['score'])

# Upsert — add without rebuilding
embeddings.upsert([{'id': 4, 'text': 'TypeScript adds types to JavaScript'}])

# Save and load
embeddings.save('/tmp/myindex')
embeddings.load('/tmp/myindex')