Apache Airflow ODBC Provider

raw JSON →
4.12.1 verified Sat Apr 25 auth: no python

This provider package enables Apache Airflow to connect to various ODBC data sources, including MS SQL Server, to execute queries and perform database operations. It is released independently from Airflow core and follows semantic versioning, with major version upgrades indicating breaking changes.

pip install apache-airflow-providers-odbc
error ModuleNotFoundError: No module named 'airflow.providers.odbc'
cause The 'apache-airflow-providers-odbc' package is not installed.
fix
Install the package using 'pip install apache-airflow-providers-odbc'.
error ImportError: cannot import name 'OdbcHook' from 'airflow.providers.odbc.hooks.odbc'
cause The 'OdbcHook' class is not available in the specified module, possibly due to an incorrect import path or missing package.
fix
Ensure that 'apache-airflow-providers-odbc' is installed and use the correct import statement: 'from airflow.providers.odbc.hooks.odbc import OdbcHook'.
error pyodbc.Error: ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
cause The ODBC driver is not installed or the data source name (DSN) is not configured correctly.
fix
Install the appropriate ODBC driver for your database and ensure the DSN is correctly configured.
error AttributeError: module 'pyodbc' has no attribute 'connect'
cause The 'pyodbc' module is not installed or is not accessible.
fix
Install the 'pyodbc' module using 'pip install pyodbc'.
error OperationalError: ('08001', '[08001] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: Error code 0x2746 (10060) (SQLDriverConnect)')
cause The connection to the SQL Server timed out, possibly due to network issues or incorrect server address.
fix
Verify the server address, network connectivity, and that the SQL Server is configured to accept remote connections.
breaking The `driver` parameter for `OdbcHook` must now be passed directly via the hook constructor or `hook_params` in SQL Operators, not through the connection's `extra` field, due to a security vulnerability fix.
fix Update usage to pass `driver` as a keyword argument to `OdbcHook` or within the `hook_params` dictionary for SQL Operators. Example `extra` for connection: `{"autocommit": true}` and `driver='{ODBC Driver 18 for SQL Server}'` passed separately to the hook.
breaking This provider version has increased minimum Airflow requirements. For example, provider `3.0.0` requires Airflow `2.2+` and provider `4.12.1` requires Airflow `2.11.0+`. Older provider versions had lower Airflow requirements.
fix Ensure your Airflow core installation meets the minimum version requirement for the installed provider. Refer to the provider's changelog for specific version compatibility.
breaking When passing keyword arguments to the ODBC connection via the `connect_kwargs` key in the connection's `extra` field (e.g., `autocommit`, `ansi`), values must now be booleans (e.g., `true`, `false`) instead of strings (`"true"`, `"false"`).
fix Update your Airflow Connection's `extra` field to use boolean literals for `connect_kwargs` values: `"connect_kwargs": {"autocommit": true, "ansi": false}`.
gotcha The `apache-airflow-providers-odbc` package and its `pyodbc` dependency require system-level ODBC drivers to be installed on your Airflow worker machines. These are not installed by `pip`.
fix Manually install the appropriate ODBC driver for your database (e.g., Microsoft ODBC Driver for SQL Server) on the operating system where Airflow workers run.
gotcha Enabling the `allow_driver_in_extra` configuration (via `AIRFLOW__PROVIDERS_ODBC__ALLOW_DRIVER_IN_EXTRA=true` environment variable or `airflow.cfg`) is a security risk. It allows users to specify arbitrary ODBC drivers via the Airflow Connection's `extra` field, potentially leading to privilege escalation or arbitrary code execution if untrusted users can modify connections.
fix Avoid setting `allow_driver_in_extra` to `True` unless you fully trust all users who can modify Airflow connections. Ensure drivers are only specified directly in DAG code or via the hook constructor.
gotcha Some users have reported `UnicodeDecodeError` (e.g., `'utf-8' codec can't decode byte 0x91`) when using specific ODBC drivers (like Netezza) with certain Airflow and provider versions.
fix If encountering encoding errors, ensure the correct character set is configured in your ODBC driver and/or connection string. Consult the specific database and ODBC driver documentation, or consider upgrading/downgrading provider/Airflow versions if a known fix exists.
pip install apache-airflow[odbc]
runtime variant status import time mem disk
3.10-alpine default
3.10-alpine odbc
3.10-slim default
3.10-slim odbc
3.11-alpine default
3.11-alpine odbc
3.11-slim default
3.11-slim odbc
3.12-alpine default
3.12-alpine odbc
3.12-slim default
3.12-slim odbc
3.13-alpine default
3.13-alpine odbc
3.13-slim default
3.13-slim odbc
3.9-alpine default
3.9-alpine odbc
3.9-slim default
3.9-slim odbc

This quickstart demonstrates a basic Airflow DAG using the `SQLExecuteQueryOperator` to interact with an ODBC database. It assumes an Airflow ODBC connection (`my_odbc_conn`) is configured in the UI or via environment variables, including the necessary ODBC driver. Remember to install system-level ODBC drivers for your specific database.

from __future__ import annotations

import pendulum

from airflow.models.dag import DAG
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

# Ensure you have an Airflow Connection named 'my_odbc_conn'
# type: ODBC
# host: your_odbc_server
# login: your_username
# password: your_password
# extra: {"driver": "{ODBC Driver 18 for SQL Server}", "autocommit": true}

with DAG(
    dag_id="odbc_example_dag",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
    tags=["odbc", "example"],
) as dag:
    create_table = SQLExecuteQueryOperator(
        task_id="create_table",
        sql="""
            CREATE TABLE IF NOT EXISTS my_odbc_table (
                id INT IDENTITY(1,1) PRIMARY KEY,
                value VARCHAR(255)
            );
        """,
        conn_id="my_odbc_conn",
    )

    insert_data = SQLExecuteQueryOperator(
        task_id="insert_data",
        sql="""
            INSERT INTO my_odbc_table (value) VALUES ('test_value_{{ ds }}');
        """,
        conn_id="my_odbc_conn",
    )

    # Note: For fetching data, you'd typically use OdbcHook in a PythonOperator
    # or a custom operator, as SQLExecuteQueryOperator is for execution.

    create_table >> insert_data