{"id":853,"library":"databricks-sqlalchemy","title":"Databricks SQLAlchemy plugin for Python","description":"Databricks SQLAlchemy is a Python library that provides a SQLAlchemy dialect for connecting to Databricks SQL warehouses and clusters. It serves as a bridge between SQLAlchemy's ORM capabilities and the Databricks SQL Python driver, enabling Python applications to interact with Databricks data using standard SQLAlchemy patterns. The library is currently at version 2.0.9 and regularly receives updates, including recent support for complex data types like VARIANT, ARRAY, and MAP.","status":"active","version":"2.0.9","language":"python","source_language":"en","source_url":"https://github.com/databricks/databricks-sqlalchemy","tags":["database","databricks","sqlalchemy","orm","sql","analytics","data-warehouse"],"install":[{"cmd":"pip install databricks-sqlalchemy","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Core ORM and SQL toolkit that this library extends.","package":"sqlalchemy"},{"reason":"The underlying Python driver used by the dialect; the SQLAlchemy dialect was split out of the connector into the standalone databricks-sqlalchemy package as of connector v3.0.0. It is required for this dialect to function.","package":"databricks-sql-connector","optional":false},{"reason":"Optional dependency for enhanced performance with large data volumes, enabling features like CloudFetch in the underlying databricks-sql-connector.","package":"pyarrow","optional":true}],"imports":[{"symbol":"create_engine","correct":"from sqlalchemy import create_engine"},{"note":"The dialect is automatically registered with SQLAlchemy upon installation; no direct class import is typically needed. 
The 'databricks://' scheme is used in the connection URL.","symbol":"DatabricksDialect (implicit)","correct":"engine = create_engine('databricks://...')"},{"note":"For timezone-aware timestamps, as SQLAlchemy's default DateTime() is mapped to Databricks' timezone-agnostic TIMESTAMP_NTZ().","symbol":"TIMESTAMP","correct":"from databricks.sqlalchemy import TIMESTAMP"},{"note":"Explicitly import if you need the timezone-agnostic timestamp type.","symbol":"TIMESTAMP_NTZ","correct":"from databricks.sqlalchemy import TIMESTAMP_NTZ"},{"note":"Databricks-specific integer type.","symbol":"TINYINT","correct":"from databricks.sqlalchemy import TINYINT"}],"quickstart":{"code":"import os\nfrom sqlalchemy import create_engine, text\n\n# Ensure these environment variables are set:\n# DATABRICKS_SERVER_HOSTNAME: Hostname of your Databricks workspace or SQL warehouse\n# DATABRICKS_HTTP_PATH: HTTP path of your SQL warehouse or cluster\n# DATABRICKS_TOKEN: Your Databricks Personal Access Token\n# DATABRICKS_CATALOG (optional): Initial Unity Catalog catalog\n# DATABRICKS_SCHEMA (optional): Initial Unity Catalog schema\n\nhost = os.environ.get('DATABRICKS_SERVER_HOSTNAME', 'your_databricks_host.cloud.databricks.com')\nhttp_path = os.environ.get('DATABRICKS_HTTP_PATH', '/sql/1.0/endpoints/your_http_path')\naccess_token = os.environ.get('DATABRICKS_TOKEN', 'dapi...')\ncatalog = os.environ.get('DATABRICKS_CATALOG', 'default')\nschema = os.environ.get('DATABRICKS_SCHEMA', 'default')\n\ntry:\n    connection_string = (\n        f\"databricks://token:{access_token}@{host}?\"\n        f\"http_path={http_path}&catalog={catalog}&schema={schema}\"\n    )\n    engine = create_engine(connection_string)\n\n    with engine.connect() as connection:\n        # Example: Execute a simple query\n        result = connection.execute(text(\"SELECT 1 AS one, 'hello' AS two\"))\n        for row in result:\n            print(f\"Row: {row.one}, {row.two}\")\n\n        # Example with parameterized query (requires 
DBR 14.2+)\n        # Note: Databricks dialect uses 'named' paramstyle ':param'\n        param_value = \"world\"\n        result = connection.execute(text(\"SELECT :val AS greeting\"), {'val': param_value})\n        for row in result:\n            print(f\"Greeting: {row.greeting}\")\n\n    print(\"Successfully connected and executed queries.\")\nexcept Exception as e:\n    print(f\"Error connecting to Databricks: {e}\")\n    print(\"Please ensure environment variables are correctly set and Databricks resources are accessible.\")","lang":"python","description":"This quickstart demonstrates how to establish a connection to Databricks using the `databricks-sqlalchemy` dialect and `create_engine`. It utilizes environment variables for secure authentication and connection details, then executes a simple SELECT query. It also includes an example of a parameterized query, highlighting the named paramstyle required by the dialect."},"warnings":[{"fix":"Ensure `databricks-sqlalchemy` and `SQLAlchemy` major versions are compatible. For SQLAlchemy 2.x, use `pip install databricks-sqlalchemy`. For SQLAlchemy 1.x, use `pip install databricks-sqlalchemy~=1.0`.","message":"The `databricks-sqlalchemy` library has distinct major versions for SQLAlchemy 1.x and SQLAlchemy 2.x. Version `1.x` of `databricks-sqlalchemy` supports SQLAlchemy 1.x, while `2.x` supports SQLAlchemy 2.x. 
Upgrading `databricks-sqlalchemy` to `2.x` requires a corresponding upgrade of `SQLAlchemy` to `2.x`, and vice versa, to ensure compatibility.","severity":"breaking","affected_versions":"All versions (specifically, the distinction between 1.x and 2.x branches)"},{"fix":"Verify your `pip install` command is `pip install databricks-sqlalchemy` and check your `requirements.txt` to avoid `sqlalchemy-databricks`.","message":"There are two similarly named, but distinct, Python packages: `databricks-sqlalchemy` (the official dialect, this package) and `sqlalchemy-databricks` (an older, separate community-driven package). Always install and use `databricks-sqlalchemy` for official and up-to-date support.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure your Databricks Runtime is 14.2 or newer, and use the `:param_name` syntax for parameters in your SQL queries when executing via SQLAlchemy.","message":"The SQLAlchemy 2.0 dialect for Databricks *always* uses native parameterized queries with the 'named' paramstyle (`:param`) and requires Databricks Runtime 14.2 or above. Using a different paramstyle or older DBR versions will result in query failures.","severity":"gotcha","affected_versions":"2.0.0 and later"},{"fix":"Use `databricks.sqlalchemy.TIMESTAMP()` for timezone-aware columns. For timezone-agnostic columns, `sqlalchemy.DateTime()` will correctly map to `TIMESTAMP_NTZ()`.","message":"SQLAlchemy's `DateTime()` type is mapped to Databricks' timezone-agnostic `TIMESTAMP_NTZ()`. If you declare `DateTime(timezone=True)`, the `timezone` argument will be ignored by the Databricks dialect. For timezone-aware timestamps, you must explicitly import and use `databricks.sqlalchemy.TIMESTAMP()`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade to `databricks-sqlalchemy` version 2.x to resolve identifier quoting issues in Unity Catalog. 
Always quote identifiers in your raw SQL if they contain special characters.","message":"Older versions of `databricks-sqlalchemy` (e.g., 1.0.5) had issues with unquoted identifiers in Unity Catalog, particularly for catalog or schema names containing hyphens. This could lead to `[INVALID_IDENTIFIER]` errors during metadata operations.","severity":"gotcha","affected_versions":"<2.0.0"},{"fix":"For auto-incrementing fields, define them as `Column(BigInteger, Identity())`.","message":"The `autoincrement` argument on SQLAlchemy `Column()` declarations (e.g., `Column(Integer, autoincrement=True)`) is currently ignored by the Databricks dialect. To create an auto-incrementing field, you must explicitly use `sqlalchemy.Identity()` and combine it with the `sqlalchemy.BigInteger()` type, as only BIGINT columns support auto-increment in Databricks Runtime.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure your environment includes necessary build tools. For Alpine Linux, install `build-base` and `python3-dev` using `apk add build-base python3-dev`. For Debian/Ubuntu, use `apt-get install build-essential python3-dev`.","message":"Installation in minimal environments (e.g., Alpine Linux) may fail for packages with C extensions if essential build tools are missing. Many Python packages, including some dependencies of `databricks-sqlalchemy`, require a C compiler and Python development headers to be built from source during installation.","severity":"breaking","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-05-12T20:24:39.515Z","next_check":"2026-06-27T00:00:00.000Z","problems":[{"fix":"Ensure `databricks-sqlalchemy` is installed in your environment: `pip install databricks-sqlalchemy`. 
Also, verify that SQLAlchemy itself is installed and up-to-date.","cause":"This error occurs when the SQLAlchemy library cannot find or load the Databricks dialect because the `databricks-sqlalchemy` package is not correctly installed or registered with SQLAlchemy.","error":"sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:databricks"},{"fix":"Install the library using pip: `pip install databricks-sqlalchemy`. If `databricks-sql-connector` is missing, install it explicitly: `pip install databricks-sql-connector`.","cause":"This typically means that the `databricks-sqlalchemy` package or its core dependency, `databricks-sql-connector`, has not been installed in the Python environment.","error":"ModuleNotFoundError: No module named 'databricks'"},{"fix":"Upgrade SQLAlchemy to a version that includes the `Uuid` type (2.0 or higher): `pip install --upgrade SQLAlchemy`. If a dependent library pins an older SQLAlchemy, adjust its version constraints so that SQLAlchemy 2.x can be installed.","cause":"This error usually indicates a version incompatibility between SQLAlchemy and a dependent library (like `langchain`), or between `databricks-sqlalchemy` and the installed SQLAlchemy version: the `Uuid` type (introduced in SQLAlchemy 2.0) is expected but not found in the installed SQLAlchemy.","error":"AttributeError: module 'sqlalchemy.types' has no attribute 'Uuid'"},{"fix":"When using `pandas.to_sql`, avoid setting `method='multi'` or break down the DataFrame into smaller chunks before inserting to stay within the parameter limit. 
Alternatively, try inserting row-by-row, though this might be slower.","cause":"This specific `OperationalError` occurs when attempting to insert too many rows or columns at once using methods like `pandas.to_sql` with `method='multi'`, exceeding the maximum parameter limit enforced by the Databricks SQL endpoint.","error":"sqlalchemy.exc.OperationalError: (databricks.sql.exc.RequestError) Error during request to server: BAD_REQUEST: Parameterized query has too many parameters: 2000 parameters were given but the limit is 256."}],"ecosystem":"pypi","meta_description":null,"install_score":50,"install_tag":"draft","quickstart_score":null,"quickstart_tag":null,"pypi_latest":"2.0.9","cli_name":"","install_checks":{"last_tested":"2026-05-12","tag":"draft","tag_description":"notable install failures or slow imports","installed_version":"2.0.9","pypi_latest":"2.0.9","is_stale":false,"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":13.5,"import_time_s":0.35,"mem_mb":15.5,"disk_size":"350M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim 
(glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.36,"mem_mb":15.5,"disk_size":"346M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":12.9,"import_time_s":0.72,"mem_mb":17.6,"disk_size":"379M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.68,"mem_mb":17.6,"disk_size":"374M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim 
(glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":14.2,"import_time_s":0.81,"mem_mb":17.4,"disk_size":"360M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.99,"mem_mb":17.4,"disk_size":"356M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":14.1,"import_time_s":0.78,"mem_mb":17.6,"disk_size":"359M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.84,"mem_mb":17.6,"disk_size":"355M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine 
(musl)","variant":"databricks-sqlalchemy","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":15.9,"import_time_s":0.41,"mem_mb":15.1,"disk_size":"347M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"databricks-sqlalchemy","exit_code":0,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":0.43,"mem_mb":15.1,"disk_size":"347M"}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":null,"tag_description":null,"results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":-1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":-1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":-1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":-1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":-1}]}}