TileDB Python API
TileDB-Py is the official Python interface to the TileDB Storage Engine, an efficient multi-dimensional array management system. It provides a Pythonic API for storing and accessing dense and sparse array data, with fast updates and reads, strong compression, and parallel I/O that scales to large datasets. The library is actively maintained and updated regularly; version 0.36.1 was released on February 25, 2026.
Common errors
- TileDBError: Error: Internal TileDB uncaught exception; std::bad_alloc
  cause: This error typically indicates an out-of-memory condition when reading or processing large TileDB arrays, particularly when loading an entire array into memory or consolidating without sufficient resources.
  fix: For reads, read the data in smaller chunks or slices rather than the entire array at once. For writes or consolidations, ensure adequate system memory (RAM) is available, or tune array schema parameters (e.g., tile capacity, compression) to reduce the memory footprint. For very large datasets, consider distributed computing frameworks or memory-mapping options if applicable.
- TileDBError: [TileDB::Array] Error: Cannot open array; Array does not exist
  cause: This error most commonly occurs when the provided URI does not point to an existing TileDB array. This can be due to a typo in the URI, an incorrect file path, or insufficient permissions. With cloud storage, it often stems from authentication/authorization issues or an incorrect bucket/path specification.
  fix: Double-check the array URI, including any file system paths or cloud storage bucket/object prefixes. Verify that the current user/process has read/write permissions for the specified location. For cloud storage, ensure authentication (e.g., AWS credentials, GCP service account) is correctly configured and accessible to the TileDB process.
- TypeError: 'Dimension' object is not subscriptable
  cause: This error indicates an attempt to treat a `tiledb.Dim` object as a sequence or dictionary (e.g., indexing into it with `[]`) rather than accessing its properties directly.
  fix: Access the properties of a `tiledb.Dim` with dot notation (e.g., `dim.name`, `dim.domain`, `dim.dtype`) instead of indexing into the object. The `tiledb.Domain` object, which contains the dimensions, is subscriptable and returns dimensions by index or name.
Warnings
- breaking: TileDB-Py 0.36.1 and later versions restrict the compatible `pandas` library to versions below 3 (`pandas < 3`). Users attempting to install or run with `pandas` 3.0 or higher may encounter dependency resolution issues or unexpected behavior.
- gotcha: Initial installation via `pip install tiledb` can take a significant amount of time, as the package automatically downloads and builds the native TileDB C++ library along with Python bindings. If `numpy` and `cython` are not pre-installed, `pip` may also build them from source, further increasing install time.
- gotcha: When working with cloud storage (e.g., S3, GCS), incorrect configuration of credentials or environment variables can lead to `TileDBError: [TileDB::Array] Error: Cannot open array; Array does not exist`, even if the URI format is correct and local operations succeed.
Install
-
pip install tiledb
Imports
- tiledb
import tiledb
- libtiledb
import tiledb.libtiledb
Quickstart
import tiledb
import numpy as np
import os
# Define array URI
array_uri = "my_dense_array"
# Clean up previous array if it exists
if os.path.exists(array_uri):
    tiledb.remove(array_uri)
# 1. Create a dense array schema
dom = tiledb.Domain(
    tiledb.Dim(name="rows", domain=(1, 4), tile=4, dtype=np.int32),
    tiledb.Dim(name="cols", domain=(1, 4), tile=4, dtype=np.int32),
)
attr = tiledb.Attr(name="data", dtype=np.int32)
schema = tiledb.ArraySchema(domain=dom, attrs=[attr], sparse=False)
# 2. Create the array
tiledb.DenseArray.create(array_uri, schema)
# 3. Write data to the array
with tiledb.DenseArray(array_uri, mode="w") as A:
    # Match the attribute's int32 dtype explicitly
    data = np.array(
        [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]],
        dtype=np.int32,
    )
    A[:] = data
# 4. Read data from the array (slice)
with tiledb.DenseArray(array_uri, mode="r") as A:
    # Read a slice (rows 1-2, cols 2-3); the end of each slice is exclusive
    subset = A[1:3, 2:4]
    print("Read subset:")
    # The result is keyed by attribute name
    print(subset["data"])
# Clean up
tiledb.remove(array_uri)