Apache Airflow MSSQL Provider
The Apache Airflow Microsoft MSSQL provider (`apache-airflow-providers-microsoft-mssql`) lets Airflow DAGs connect to, query, and manage data in Microsoft SQL Server through dedicated hooks and operators. The provider is part of the larger Apache Airflow ecosystem, currently at version 4.5.1, and receives updates aligned with Airflow's release cycle as well as independent bug fixes and feature enhancements.
Warnings
- breaking The `apache-airflow-providers-microsoft-mssql` provider now requires Python 3.10 or higher. Users running Airflow on older Python versions (e.g., 3.8, 3.9) must upgrade their Python environment to use provider version 4.0.0 and above.
- gotcha This provider itself does not include the actual database drivers (`pymssql` or `pyodbc`). You must install one of these Python packages separately and ensure any necessary system-level ODBC drivers (for `pyodbc`) are also installed on your Airflow worker host.
- gotcha Connection string issues are common. With `pyodbc` in particular, the exact driver name and extra connection options (e.g., trusting the server certificate) often need to be supplied as JSON in the `Extra` field of the Airflow connection for the handshake with SQL Server to succeed.
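As a sketch of one way to supply those extras, a connection can also be defined entirely through an `AIRFLOW_CONN_<CONN_ID>` environment variable, with the JSON extras carried in the `__extra__` query parameter. All host, credential, and option values below are placeholders:

```python
import json
from urllib.parse import quote, urlencode

# Placeholder values -- substitute your own host, database, and credentials.
login = "airflow_user"
password = "s3cret/pass"  # special characters must be URL-encoded
host = "mssql.example.com"
port = 1433
schema = "analytics"

# Extras commonly needed with pyodbc; both keys are illustrative.
extra = {
    "driver": "ODBC Driver 17 for SQL Server",
    "TrustServerCertificate": "yes",
}

# Airflow reads connections from environment variables named
# AIRFLOW_CONN_<CONN_ID>; extras travel as URL-encoded JSON in __extra__.
uri = (
    f"mssql://{quote(login)}:{quote(password, safe='')}@{host}:{port}/{schema}"
    f"?{urlencode({'__extra__': json.dumps(extra)})}"
)
print(uri)
```

Exporting this string as `AIRFLOW_CONN_MSSQL_DEFAULT` makes it visible to tasks without touching the metadata database.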
Install
pip install apache-airflow-providers-microsoft-mssql
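The pip command above installs only the provider, not a database driver. As a quick, illustrative sanity check (the helper function below is hypothetical, not part of the provider), you can verify on a worker that at least one supported driver package is importable:

```python
import importlib.util

def available_mssql_drivers():
    """Return which of the two supported driver packages can be imported."""
    return [
        name
        for name in ("pymssql", "pyodbc")
        if importlib.util.find_spec(name) is not None
    ]

# An empty list means neither driver is installed and connections will fail.
print(available_mssql_drivers())
```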
Imports
- MsSqlHook
from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
- MsSqlOperator
from airflow.providers.microsoft.mssql.operators.mssql import MsSqlOperator
- MsSqlToS3Operator
from airflow.providers.microsoft.mssql.transfers.mssql_to_s3 import MsSqlToS3Operator
Quickstart
from __future__ import annotations
import pendulum
from airflow.models.dag import DAG
from airflow.providers.microsoft.mssql.operators.mssql import MsSqlOperator
# Configure an Airflow Connection named 'mssql_default'
# Host: your_mssql_host
# Port: 1433 (default)
# Schema: your_database_name
# Login: your_username
# Password: your_password
# Extra: {"driver": "ODBC Driver 17 for SQL Server"} (if using pyodbc)
with DAG(
    dag_id="mssql_quickstart_dag",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
    tags=["mssql", "example"],
) as dag:
    run_simple_query = MsSqlOperator(
        task_id="run_select_statement",
        mssql_conn_id="mssql_default",
        sql="SELECT 1 as result;",
        autocommit=True,
    )
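Beyond the operator, `MsSqlHook` gives row-level access from inside task code. A minimal sketch, assuming the same `mssql_default` connection and a hypothetical `dbo.orders` table (not testable without a live SQL Server):

```python
from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook

def fetch_large_orders():
    hook = MsSqlHook(mssql_conn_id="mssql_default")
    # get_records returns a list of row tuples; pass query parameters
    # separately (pymssql uses %s-style placeholders) rather than
    # formatting values into the SQL string.
    return hook.get_records(
        "SELECT TOP 10 order_id, total FROM dbo.orders WHERE total > %s",
        parameters=(100,),
    )
```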