ArcticDB DataFrame Database
ArcticDB is a high-performance, serverless DataFrame database engine designed for the Python Data Science ecosystem. It stores, retrieves, and processes Pandas DataFrames at scale, backed either by commodity object storage (S3-compatible services, Azure Blob Storage) or by a local LMDB store. It provides a Python API on top of a fast C++ data-processing and compression engine, and supports flexible schemas and bitemporal versioning. The library is actively maintained with frequent patch and minor releases.
Common errors
-
Normalization Error 2000: Attempting to update or append an existing type with an incompatible object type
cause You are trying to update or append data to an existing symbol, but the new data (NumPy array or Pandas DataFrame) has a different type than the data already stored in the symbol.
fix Read the latest version of the symbol and make sure your update/append operation uses data of the corresponding type. If you need to change the schema, the library must be created with `dynamic_schema=True`.
-
LMDB error code -30792
cause The LMDB map size limit has been reached, meaning your local database file is full and cannot store more data.
fix Increase the map size when initializing `Arctic` for LMDB by passing a larger `map_size` in the URI or options. Example: `adb.Arctic('lmdb:///path/to/db?map_size=100GB')`.
-
Normalization Error 2003: A write of an incompatible index type has been attempted.
cause ArcticDB only supports specific Pandas index types for its on-disk index (e.g., `DatetimeIndex` for time-series, or a `RowCount` integer index). You attempted to write data with an unsupported index type.
fix Ensure your Pandas DataFrame uses a supported index type, such as `DatetimeIndex` for time-series data. Non-DatetimeIndex will be converted to a `RowCount` integer index.
-
ArcticException: A missing key has been requested. [...] The symbol being worked on does not exist.
cause You are trying to read, update, or append to a symbol that does not exist in the specified library.
fix Verify the symbol name is correct. Use `library.list_symbols()` to check existing symbols. Ensure the symbol has been written at least once before attempting read/modify operations.
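Two of the fixes above (the LMDB map size and the missing-symbol check) can be sketched as small helpers. This is a minimal sketch, not part of ArcticDB: `lmdb_uri`, `open_lmdb`, and `read_or_none` are hypothetical names, and the default of `100GB` is taken from the example above rather than being a recommendation. Only `adb.Arctic`, `list_symbols()`, and `read(...).data` are library calls.

```python
from typing import Any, Optional


def lmdb_uri(path: str, map_size: str = "100GB") -> str:
    """Build an LMDB connection URI with an explicit map size (hypothetical helper)."""
    return f"lmdb://{path}?map_size={map_size}"


def open_lmdb(path: str, map_size: str = "100GB") -> Any:
    """Open an Arctic instance on a local LMDB directory with a larger map size."""
    # Imported lazily so the pure-Python helpers work without arcticdb installed.
    import arcticdb as adb

    return adb.Arctic(lmdb_uri(path, map_size))


def read_or_none(lib: Any, symbol: str) -> Optional[Any]:
    """Return the symbol's data, or None if the symbol has never been written."""
    # list_symbols() is the check suggested above; it avoids the missing-key error.
    if symbol not in lib.list_symbols():
        return None
    return lib.read(symbol).data
```

Calling `lmdb_uri("/path/to/db")` yields the same URI as the example in the LMDB entry. Listing symbols on every read may be slow on large libraries, so a check like this is probably best kept to code paths where a missing symbol is expected.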
Warnings
- gotcha When using `update` or `append` operations, ArcticDB enforces a static schema by default. If you intend to modify the schema (add or remove columns, or change column types), you must create the library with `dynamic_schema=True`.
- gotcha For LMDB storage, ensure only one Arctic instance is open per LMDB database within a single Python process. LMDB also does not support remote filesystems.
- gotcha Production use of ArcticDB (including business or commercial environments) or its use as a Database Service requires a paid license from ArcticDB Limited.
- gotcha When connecting to Azure Blob Storage, if the `CA_cert_path` is incorrect or the certificate cannot be found, Azure exceptions may lack meaningful error codes, making debugging difficult.
- deprecated The `QueryBuilder` API is officially deprecated in favor of the more intuitive and recommended `LazyDataFrame` API for complex querying, filtering, group-bys, and aggregations.
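The static-schema warning above suggests a pre-flight check before calling `update` or `append`. The sketch below compares two column-to-dtype mappings and reports which columns were added, removed, or retyped. `SchemaDiff` and `diff_schema` are hypothetical names, not ArcticDB APIs, and ArcticDB's own compatibility rules may be stricter or more permissive than this plain equality check.

```python
from typing import Dict, List, NamedTuple


class SchemaDiff(NamedTuple):
    """Columns that differ between the stored schema and the incoming data."""

    added: List[str]
    removed: List[str]
    retyped: List[str]

    @property
    def changed(self) -> bool:
        # Any difference means the write would modify the stored schema.
        return bool(self.added or self.removed or self.retyped)


def diff_schema(existing: Dict[str, str], incoming: Dict[str, str]) -> SchemaDiff:
    """Compare column -> dtype-name mappings (hypothetical helper)."""
    added = [col for col in incoming if col not in existing]
    removed = [col for col in existing if col not in incoming]
    retyped = [
        col for col in incoming
        if col in existing and incoming[col] != existing[col]
    ]
    return SchemaDiff(added, removed, retyped)
```

With Pandas, each mapping can be built as `{col: str(dtype) for col, dtype in df.dtypes.items()}`, once from the DataFrame read back from the symbol and once from the new data. If `changed` is true, the write needs a library created with `dynamic_schema=True`, per the warning above.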
Install
-
pip install arcticdb
Imports
- Arctic
from arcticdb import Arctic
- adb
import arcticdb as adb
Quickstart
import arcticdb as adb
import pandas as pd
import numpy as np
import os
# Use a temporary LMDB directory for local storage
uri = f"lmdb://{os.environ.get('ARCTICDB_PATH', '/tmp/arcticdb_quickstart')}"
ac = adb.Arctic(uri)
# Create a library
library_name = 'my_test_library'
if library_name not in ac.list_libraries():
    ac.create_library(library_name)
lib = ac.get_library(library_name)
# Create a sample DataFrame
df = pd.DataFrame(np.random.randint(0, 100, size=(10, 3)), columns=list('ABC'))
df.index = pd.date_range('2023-01-01', periods=10, freq='D')
# Write the DataFrame
symbol_name = 'my_test_symbol'
lib.write(symbol_name, df)
print(f"DataFrame written to {library_name}/{symbol_name}")
# Read the DataFrame back
read_df = lib.read(symbol_name).data
print("Read DataFrame head:")
print(read_df.head())
# Clean up (optional for temporary storage)
# ac.delete_library(library_name)