SQLAlchemy Dremio
SQLAlchemy Dremio is a SQLAlchemy dialect that enables connecting to Dremio via its Apache Arrow Flight interface. It's actively maintained, with the current stable version being 3.0.5, and generally follows a release cadence tied to Dremio Flight client updates and SQLAlchemy compatibility.
Common errors
-
sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:dremio
cause `sqlalchemy-dremio` library is not installed or accessible in the current Python environment.fixRun `pip install sqlalchemy-dremio` to install the dialect. -
ModuleNotFoundError: No module named 'dremio_flight_client'
cause The core dependency `dremio-flight-client` is missing. This is required for establishing Flight connections.fixRun `pip install dremio-flight-client` or simply `pip install sqlalchemy-dremio` (which should pull it as a dependency). -
pyarrow.lib.ArrowIOError: Failed to connect to ... Connection refused
cause The Dremio Flight server is not reachable at the specified host and port, or a firewall is blocking the connection. This can also indicate incorrect port or a missing SSL flag if the server expects SSL.fixVerify that Dremio is running, the Flight endpoint is enabled, the host and port (commonly 32010) are correct, and network firewalls allow the connection. Also, check if `USE_SSL=true` is needed in the connection string. -
pyarrow.lib.ArrowIOError: UNAUTHENTICATED: Invalid credentials
cause The username or password provided in the connection string is incorrect for the Dremio instance.fixDouble-check the username and password in your connection string or environment variables. Ensure the user has permissions to access Dremio via Flight. -
sqlalchemy.exc.ProgrammingError: (dremio_flight_client.client.DremioFlightClientException) RESOURCE_EXHAUSTED: Schema/Table '...' not found
cause The SQL query references a table or schema that does not exist or is not accessible, often due to an incorrect default catalog in the connection string or an incorrect path in the query.fixEnsure the catalog specified in the connection string (e.g., `/DREMIO`) is correct. Also, verify the table/schema name in your query and ensure it's fully qualified if necessary (e.g., `"Dremio"."myspace"."mytable"`).
Warnings
- breaking Version 3.x introduced a breaking change by switching from the deprecated `PyDremio` library to the official `dremio-flight-client` and updating the connection string scheme. Older versions used `dremio://`, while 3.x+ requires `dremio+flight://`.
- gotcha Dremio's Flight endpoint often requires SSL/TLS by default. Failing to include `USE_SSL=true` in your connection string (or explicitly setting it to false if the server doesn't use SSL) can lead to connection failures.
- gotcha The connection string must include a catalog (e.g., `/DREMIO`, `/your_space`) after the host and port. Omitting it or providing an incorrect one can lead to 'Table not found' or 'Schema not found' errors, as queries might not resolve to the correct Dremio context.
- gotcha Dremio Flight often runs on a different port than the UI or ODBC/JDBC connections (commonly 32010). Using the wrong port (e.g., 9047 for UI) will result in connection refused errors.
Install
-
pip install sqlalchemy-dremio -
pip install sqlalchemy-dremio[pandas]
Quickstart
import sqlalchemy
import os
# Configure Dremio connection details using environment variables for security
DREMIO_HOST = os.environ.get('DREMIO_HOST', 'localhost')
DREMIO_PORT = os.environ.get('DREMIO_PORT', '32010')
DREMIO_USER = os.environ.get('DREMIO_USER', 'dremio_user')
DREMIO_PASS = os.environ.get('DREMIO_PASS', 'dremio_password')
DREMIO_CATALOG = os.environ.get('DREMIO_CATALOG', 'DREMIO') # e.g., 'DREMIO' or your space name
USE_SSL = os.environ.get('DREMIO_USE_SSL', 'true').lower() == 'true'
# Construct the connection string
connection_string = (
f"dremio+flight://{DREMIO_USER}:{DREMIO_PASS}@{DREMIO_HOST}:{DREMIO_PORT}/"
f"{DREMIO_CATALOG}?USE_SSL={str(USE_SSL).lower()}"
)
print(f"Connecting to: {connection_string.split(DREMIO_PASS)[0]}*****@{DREMIO_HOST}:{DREMIO_PORT}/{DREMIO_CATALOG}...")
try:
# Create the SQLAlchemy engine
engine = sqlalchemy.create_engine(connection_string)
# Establish a connection and execute a simple query
with engine.connect() as connection:
# Ensure you have a table accessible, e.g., 'sys.version'
result = connection.execute(sqlalchemy.text("SELECT * FROM sys.version"))
# Fetch and print results
print("\nQuery Results:")
for row in result:
print(row)
print("\nSuccessfully connected to Dremio and executed a query.")
except Exception as e:
print(f"\nError connecting to Dremio or executing query: {e}")
print("Please ensure Dremio is running, credentials are correct, ")
print("and the Flight endpoint is accessible (often port 32010, sometimes with SSL).")