pyexasol Exasol Python Driver
pyexasol is the officially supported Python connector for Exasol, designed for high-performance data handling with low overhead, fast HTTP transport, and compression. It provides an API for parallel data stream processing, offering significant performance improvements over ODBC/JDBC solutions, especially with `pandas`, `parquet`, and `polars`. The library is actively maintained with a regular release cadence, often seeing monthly or bi-monthly updates.
Warnings
- breaking Python 3.9 support was dropped in PyExasol v2.0.0. Projects using Python 3.9 or older must upgrade their Python version to >=3.10 to use PyExasol v2.0.0 or later.
- breaking The `export_params['with_column_names']` parameter for export functions (e.g., `export_to_pandas`) now strictly requires a boolean value. Prior to v2.0.0, its mere presence would be interpreted as `True` regardless of the assigned value.
- breaking From PyExasol v1.0.0, strict certificate verification became the default behavior for `pyexasol.connect()` and `ExaConnection`. This changes the default `websocket_sslopt=None` from effectively `{'cert_reqs': ssl.CERT_NONE}` to `{'cert_reqs': ssl.CERT_REQUIRED}`, potentially causing `SSL: CERTIFICATE_VERIFY_FAILED` errors for users not explicitly configuring SSL options. Version 1.0.1 introduced a partial mitigation for fingerprint users.
- gotcha When using `export_to_parquet`, the destination directory (dst) by default must be empty or not exist. If it exists and contains files, an exception may be raised unless `callback_params['existing_data_behavior']` is set to 'overwrite_or_ignore' or 'delete_matching'.
- gotcha When exporting data to pandas using `export_to_pandas`, data types might not be perfectly preserved due to differences in Exasol and pandas/NumPy type systems (e.g., Exasol's exact decimals or large integers may map to floats or objects in pandas, potentially losing precision). The underlying mechanism uses CSV export/import.
- gotcha PyExasol v2.1.0 improved error reporting for `import_from_callback` and `export_to_callback` by wrapping exceptions from various internal threads. Earlier versions might have provided less clear error messages during data transfer failures, making debugging harder.
Install
-
pip install pyexasol -
pip install pyexasol[pandas,pyarrow,polars,orjson]
Imports
- ExaConnection
from pyexasol import ExaConnection
Quickstart
import os
from pyexasol import ExaConnection
# Replace with your Exasol connection details or environment variables
EXASOL_HOST = os.environ.get('EXASOL_HOST', '127.0.0.1')
EXASOL_PORT = os.environ.get('EXASOL_PORT', '8563')
EXASOL_USER = os.environ.get('EXASOL_USER', 'sys')
EXASOL_PASSWORD = os.environ.get('EXASOL_PASSWORD', 'exasol')
try:
# Connect to the Exasol database
# For production, ensure proper SSL options and fingerprint are used
# For local/testing without certs, add: websocket_sslopt={'check_hostname': False, 'verify_mode': 0}
# See warnings for v1.0.0 regarding strict certificate verification defaults.
con = ExaConnection(
dsn=f"{EXASOL_HOST}:{EXASOL_PORT}",
user=EXASOL_USER,
password=EXASOL_PASSWORD
)
print("Successfully connected to Exasol.")
# Execute a DDL statement
con.execute("CREATE SCHEMA IF NOT EXISTS MY_SCHEMA")
con.execute("OPEN SCHEMA MY_SCHEMA")
# Execute a DML statement
con.execute("CREATE OR REPLACE TABLE my_table (id INT, name VARCHAR(100))")
con.execute("INSERT INTO my_table VALUES (1, 'Alice'), (2, 'Bob')")
# Fetch data
stmt = con.execute("SELECT * FROM my_table ORDER BY id")
results = stmt.fetchall()
print("Fetched results:", results)
# Fetch data into a pandas DataFrame (requires `pyexasol[pandas]`)
# import pandas as pd
# df = con.export_to_pandas("SELECT * FROM my_table")
# print("Fetched into DataFrame:\n", df)
except Exception as e:
print(f"An error occurred: {e}")
finally:
if 'con' in locals() and con.is_connected():
con.close()
print("Connection closed.")