LanceDB
0.29.2 · verified Tue May 12 · auth: no · python install: stale · quickstart: stale
Embedded, serverless vector database built on the Lance columnar format (Apache Arrow-based). Runs in-process with no separate server required — data is stored on the local filesystem or object storage (S3, GCS, Azure). Supports vector similarity search, full-text search, SQL filtering, and automatic versioning. Also available as a managed cloud service (LanceDB Cloud). Python package is in 'Alpha' status on PyPI despite being production-used. Backed by pyarrow and pylance (the Lance Rust library, not Microsoft's Python language server).
pip install lancedb
Common errors
error ModuleNotFoundError: No module named 'lance.vector' ↓
cause The underlying `pylance` dependency (imported as `lance`), which `lancedb` relies on for its core functionality, is missing or broken in the active environment, usually after a dependency conflict or a partial install.
fix Reinstall in a fresh virtual environment with pip install lancedb. If the issue persists, verify that pylance was installed alongside it.
error ImportError: cannot import name 'LanceDb' from 'lancedb' ↓
cause There is no `LanceDb` class in the `lancedb` top-level module. A database connection is opened with `lancedb.connect()`.
fix Instead of from lancedb import LanceDb, use import lancedb and connect with db = lancedb.connect("path/to/db").
error AttributeError: 'pyarrow.lib.DataType' object has no attribute 'value_field' ↓
cause The installed `lancedb` and `pyarrow` versions are incompatible. `lancedb` depends on specific `pyarrow` features, so a `pyarrow` that is too old or too new triggers this error.
fix Upgrade both packages together with pip install --upgrade lancedb pyarrow. If the issue persists, check LanceDB's documentation for its pyarrow version requirements.
error RuntimeError: lance error: LanceError(Arrow): Arrow error: C Data interface error: Unknown error: 'pyarrow.lib.RecordBatch' object has no attribute 'set_column'. Detail: Python exception: AttributeError. ↓
cause `lancedb` calls `RecordBatch.set_column`, which does not exist in `pyarrow` versions older than 16.0.0.
fix Upgrade pyarrow to 16.0.0 or newer: pip install --upgrade "pyarrow>=16.0.0".
Warnings
breaking Illegal instruction (SIGILL) crash on import on older Intel CPUs (pre-AVX2). lancedb/pylance wheels are compiled with AVX2 SIMD instructions. Affects Ubuntu 20.04 on older hardware and some VMs where CPU features are masked. ↓
fix Requires AVX2-capable CPU. Check with: grep avx2 /proc/cpuinfo. No workaround via pip — must use newer hardware or build from source without AVX2.
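For scripts that must run on mixed hardware, the grep check above can be done from Python before lancedb is imported. A minimal sketch (Linux-only: it reads /proc/cpuinfo and skips the check elsewhere; the helper name is illustrative):

```python
# Check for AVX2 support before importing lancedb, so the process fails
# with a readable message instead of a SIGILL crash. /proc/cpuinfo does not
# exist on macOS or Windows, so those platforms are assumed OK here.
from pathlib import Path

def has_avx2() -> bool:
    cpuinfo = Path("/proc/cpuinfo")
    if not cpuinfo.exists():
        return True  # non-Linux platform: cannot check this way
    return "avx2" in cpuinfo.read_text()

if not has_avx2():
    raise SystemExit("CPU lacks AVX2: lancedb wheels will crash on import")
```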
breaking Some lancedb releases have pinned a pre-release version of pylance as a hard dependency (e.g., lancedb==0.17.1 required pylance==0.21.0b5). This breaks pip/uv installs in strict environments that disallow pre-release packages. ↓
fix If a version fails to resolve, pin to the previous minor version or add --prerelease=allow to uv. Check GitHub releases for known bad versions.
breaking Python >=3.10 required as of lancedb 0.25+. Earlier Python versions raise install or import errors. ↓
fix Use Python 3.10, 3.11, 3.12, or 3.13.
gotcha PyPI status is 'Development Status :: 3 - Alpha' despite being widely used in production. The API has had breaking changes between minor versions. Pin lancedb to a specific version in production. ↓
fix Pin in requirements: lancedb==0.29.2. Review CHANGELOG before upgrading.
gotcha ANN index (create_index) must be created explicitly. Without it, all searches are O(n) brute-force regardless of dataset size. No warning is emitted — queries silently degrade at scale. ↓
fix Call table.create_index(metric='cosine') after loading data. For large datasets, tune num_partitions and num_sub_vectors for IVF_PQ.
gotcha Automatic versioning creates a new Lance snapshot on every write operation. On high-frequency write workloads this accumulates many small version files rapidly, increasing storage and compaction overhead. ↓
fix Periodically run table.compact_files() and table.cleanup_old_versions() to manage storage. This is not done automatically.
gotcha pylance (the LanceDB dependency, imported as the module `lance`) shares its name with Pylance, Microsoft's Python language server for VS Code. The two are unrelated: the language server ships through the VS Code marketplace, not PyPI, so searches for "pylance" errors often surface the wrong project. lancedb installs pylance automatically as a transitive dependency. ↓
fix Do not pin or upgrade pylance manually; let lancedb resolve the compatible version. For editor tooling problems, look at the VS Code Pylance extension instead.
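To confirm what is actually installed, a stdlib-only check (prints the distribution version if present, without importing anything heavy):

```python
# LanceDB's dependency is the PyPI distribution "pylance", importable as
# the module `lance`; Microsoft's Pylance VS Code language server is not a
# Python library at all.
from importlib.metadata import PackageNotFoundError, version

try:
    print("pylance distribution:", version("pylance"))
except PackageNotFoundError:
    print("pylance is not installed in this environment")
```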
Install
pip install lancedb[embeddings]
pip install lancedb[azure]
pip install --pre --extra-index-url https://pypi.fury.io/lancedb/ lancedb
Install compatibility (stale; last tested: 2026-05-12)
python  os / libc      variant     wheel  install  import  disk
3.9     alpine (musl)  --pre       -      -        -       -
3.9     alpine (musl)  lancedb     -      -        -       -
3.9     alpine (musl)  azure       -      -        -       -
3.9     alpine (musl)  embeddings  -      -        -       -
3.9     slim (glibc)   --pre       -      -        4.41s   375M
3.9     slim (glibc)   lancedb     -      -        4.44s   375M
3.9     slim (glibc)   azure       -      -        5.61s   416M
3.9     slim (glibc)   embeddings  -      -        -       -
3.10    alpine (musl)  --pre       -      -        -       -
3.10    alpine (musl)  lancedb     -      -        -       -
3.10    alpine (musl)  azure       -      -        -       -
3.10    alpine (musl)  embeddings  -      -        -       -
3.10    slim (glibc)   --pre       -      -        3.85s   367M
3.10    slim (glibc)   lancedb     -      -        3.90s   366M
3.10    slim (glibc)   azure       -      -        5.08s   406M
3.10    slim (glibc)   embeddings  -      -        -       -
3.11    alpine (musl)  --pre       -      -        -       -
3.11    alpine (musl)  lancedb     -      -        -       -
3.11    alpine (musl)  azure       -      -        -       -
3.11    alpine (musl)  embeddings  -      -        -       -
3.11    slim (glibc)   --pre       -      -        5.77s   378M
3.11    slim (glibc)   lancedb     -      -        5.78s   377M
3.11    slim (glibc)   azure       -      -        7.28s   421M
3.11    slim (glibc)   embeddings  -      -        -       -
3.12    alpine (musl)  --pre       -      -        -       -
3.12    alpine (musl)  lancedb     -      -        -       -
3.12    alpine (musl)  azure       -      -        -       -
3.12    alpine (musl)  embeddings  -      -        -       -
3.12    slim (glibc)   --pre       -      -        5.50s   365M
3.12    slim (glibc)   lancedb     -      -        5.37s   364M
3.12    slim (glibc)   azure       -      -        7.24s   407M
3.12    slim (glibc)   embeddings  -      -        -       -
3.13    alpine (musl)  --pre       -      -        -       -
3.13    alpine (musl)  lancedb     -      -        -       -
3.13    alpine (musl)  azure       -      -        -       -
3.13    alpine (musl)  embeddings  -      -        -       -
3.13    slim (glibc)   --pre       -      -        5.20s   365M
3.13    slim (glibc)   lancedb     -      -        5.40s   364M
3.13    slim (glibc)   azure       -      -        6.91s   407M
3.13    slim (glibc)   embeddings  -      -        -       -
Imports
- lancedb
import lancedb
- LanceModel (schema)
wrong: from pydantic import BaseModel
correct: from lancedb.pydantic import LanceModel, Vector
Quickstart (stale; last tested: 2026-05-12)
import lancedb
import numpy as np
from lancedb.pydantic import LanceModel, Vector

# Connect (creates the directory if it does not exist)
db = lancedb.connect("/tmp/my-lancedb")

# Define schema using LanceModel
class Item(LanceModel):
    text: str
    vector: Vector(128)  # fixed dimensionality

# Create table
table = db.create_table("items", schema=Item, mode="overwrite")

# Add data
data = [
    Item(text="hello world", vector=np.random.rand(128).astype('float32'))
    for _ in range(100)
]
table.add(data)

# Vector search (returns a pandas DataFrame by default)
query_vec = np.random.rand(128).astype('float32')
results = table.search(query_vec).limit(5).to_pandas()
print(results)

# Create ANN index (required at scale)
table.create_index(metric="cosine")  # IVF_PQ by default