StarRocks SQLAlchemy Dialect
The `starrocks` library provides a Python SQLAlchemy Dialect and an Alembic extension for interacting with StarRocks, a next-generation data platform designed for fast, real-time analytics. It enables developers to leverage SQLAlchemy's ORM and expression language, and manage database schema migrations with Alembic. The project is actively maintained with releases occurring every few months.
Warnings
- breaking The `starrocks` Python client currently supports SQLAlchemy versions `>=1.4` and strictly less than `2.0`. SQLAlchemy 2.0 and later are not supported, so pin the dependency (for example, `sqlalchemy>=1.4,<2.0`) to keep an upgrade from breaking the dialect.
- gotcha For basic synchronous StarRocks connections, an underlying MySQL-compatible DBAPI driver (like `mysqlclient` or `PyMySQL`) might be required, even if not explicitly listed as a direct dependency of the `starrocks` package. If you encounter errors related to `MySQLdb` or similar, install one of these drivers.
- gotcha The library officially supports Python versions `3.10` through `3.14`. Using it with Python `3.15` or newer may lead to unexpected behavior or incompatibilities.
- breaking When upgrading StarRocks *server* versions, be aware of strict downgrade limitations. For instance, downgrading from StarRocks 4.1 to any 4.0 version below 4.0.6 is not supported due to internal changes in data layout. Similar restrictions apply for downgrading from 4.0 to versions below 3.5.2.
- gotcha Connecting to and managing external catalogs (e.g., Iceberg) via the SQLAlchemy dialect might have limitations or require specific connection string formats. An open GitHub issue indicates challenges with specifying catalogs directly and reflecting tables.
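The client-side warnings above (SQLAlchemy range, Python range, DBAPI driver) can be checked before the first connection attempt. The following is a minimal sketch using only the standard library. The helper names are hypothetical and not part of the `starrocks` package, and the version bounds are copied from the warnings above rather than read from the package metadata.

```python
import importlib.metadata
import importlib.util
import re
import sys

# Bounds copied from the warnings above (assumption: they still match the
# installed release of the starrocks package).
SQLALCHEMY_MIN = (1, 4)
SQLALCHEMY_MAX_EXCLUSIVE = (2, 0)
PYTHON_MIN = (3, 10)
PYTHON_MAX = (3, 14)
# Import names of mysqlclient and PyMySQL, respectively.
DRIVER_MODULES = ("MySQLdb", "pymysql")


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.4.52'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def sqlalchemy_supported(version):
    """True if the SQLAlchemy version is >=1.4 and <2.0."""
    return SQLALCHEMY_MIN <= parse_major_minor(version) < SQLALCHEMY_MAX_EXCLUSIVE


def python_supported(version_info=sys.version_info):
    """True if the interpreter is within Python 3.10 through 3.14."""
    return PYTHON_MIN <= tuple(version_info[:2]) <= PYTHON_MAX


def available_drivers():
    """Return the MySQL-compatible DBAPI modules that can be imported."""
    return [
        name for name in DRIVER_MODULES
        if importlib.util.find_spec(name) is not None
    ]


def installed_sqlalchemy_version():
    """Return the installed SQLAlchemy version string, or None if absent."""
    try:
        return importlib.metadata.version("SQLAlchemy")
    except importlib.metadata.PackageNotFoundError:
        return None
```

A startup script could call `installed_sqlalchemy_version()`, pass the result to `sqlalchemy_supported()`, and stop with a clear message when `available_drivers()` returns an empty list, instead of failing later with a `MySQLdb` import error.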
Install
pip install starrocks
Imports
- create_engine
from sqlalchemy import create_engine, text
- StarRocks-specific types
from starrocks import INTEGER, STRING, ARRAY, ...
Quickstart
import os
from sqlalchemy import create_engine, text
# Configure connection details via environment variables
STARROCKS_USER = os.environ.get('STARROCKS_USER', 'root')
STARROCKS_PASSWORD = os.environ.get('STARROCKS_PASSWORD', '')
STARROCKS_HOST = os.environ.get('STARROCKS_HOST', 'localhost')
STARROCKS_PORT = os.environ.get('STARROCKS_PORT', '9030')
STARROCKS_DATABASE = os.environ.get('STARROCKS_DATABASE', 'mydatabase')
# Construct connection string
connection_string = (
    f"starrocks://{STARROCKS_USER}:"  # User and optional password
    f"{STARROCKS_PASSWORD}@{STARROCKS_HOST}:{STARROCKS_PORT}/"
    f"{STARROCKS_DATABASE}"
)
# Create a SQLAlchemy engine
engine = create_engine(connection_string)
try:
    # Establish a connection and execute a basic query
    with engine.connect() as connection:
        print("Connection successful!")
        # Sanity check that does not depend on any table
        result = connection.execute(text("SELECT 1 + 1")).scalar()
        print(f"Query result: {result}")
        # Example: fetch rows from a table (uncomment if 'mytable' exists in 'mydatabase')
        # rows = connection.execute(text("SELECT * FROM mytable LIMIT 2")).fetchall()
        # print(rows)
except Exception as e:
    print(f"An error occurred: {e}")
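The Quickstart interpolates the user and password directly into the URL, which breaks when a password contains characters such as `@`, `:` or `/`. A minimal sketch of a safer builder follows. It percent-encodes the credentials with the standard library's `quote_plus`, the approach SQLAlchemy's documentation describes for database URLs. The function name `build_starrocks_url` is hypothetical and not part of the `starrocks` package.

```python
from urllib.parse import quote_plus


def build_starrocks_url(user, password, host, port, database):
    """Build a starrocks:// URL with percent-encoded credentials."""
    credentials = quote_plus(user)
    if password:
        # Only add the ':password' part when a password is set.
        credentials += ":" + quote_plus(password)
    return f"starrocks://{credentials}@{host}:{port}/{database}"
```

The returned string can be passed to `create_engine` in place of `connection_string`. For example, the password `p@ss/word` is encoded as `p%40ss%2Fword`, so the `@` that separates credentials from the host stays unambiguous.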