Apache Airflow Snowflake Provider


The `apache-airflow-providers-snowflake` library is the official Apache Airflow provider package for interacting with the Snowflake Data Cloud from Airflow DAGs. It includes hooks, operators, and transfers for executing SQL queries, managing data, and leveraging Snowflake-specific features. The current version is 6.12.0, and it follows the Airflow provider release cadence, with frequent releases introducing new features and bug fixes.

pip install apache-airflow-providers-snowflake
error ModuleNotFoundError: No module named 'airflow.providers.snowflake'
cause The 'apache-airflow-providers-snowflake' package is not installed or not accessible in the Python environment.
fix
Install the package using 'pip install apache-airflow-providers-snowflake'.
error ImportError: cannot import name 'SnowflakeHook' from 'airflow.contrib.hooks.snowflake_hook'
cause The 'SnowflakeHook' has been moved from 'airflow.contrib.hooks.snowflake_hook' to 'airflow.providers.snowflake.hooks.snowflake' in newer versions of Airflow.
fix
Update the import statement to 'from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook'.
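For reference, a minimal sketch of the updated import, using the hook's standard DB-API helpers to fetch a single row (the `snowflake_default` connection id is assumed to exist):

# Updated import path for the provider-based hook
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook

def current_snowflake_version() -> str:
    # get_first() comes from the common DbApiHook interface and returns the first row
    hook = SnowflakeHook(snowflake_conn_id="snowflake_default")
    row = hook.get_first("SELECT CURRENT_VERSION()")
    return row[0]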
error ImportError: cannot import name 'SnowflakeOperator' from 'airflow.contrib.operators.snowflake_operator'
cause The 'SnowflakeOperator' has been relocated from 'airflow.contrib.operators.snowflake_operator' to 'airflow.providers.snowflake.operators.snowflake' in recent Airflow versions.
fix
Modify the import statement to 'from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator'.
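A quick before/after of the import paths; the generic `SQLExecuteQueryOperator` shown last is the recommended replacement (see the gotcha and quickstart below):

# Old location (removed):
# from airflow.contrib.operators.snowflake_operator import SnowflakeOperator

# Provider-based location (deprecated in recent provider versions):
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

# Recommended generic alternative from apache-airflow-providers-common-sql:
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator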
error ModuleNotFoundError: No module named 'snowflake'
cause The 'snowflake-connector-python' package, required for Snowflake integration, is not installed.
fix
Install the package using 'pip install snowflake-connector-python'.
error ImportError: cannot import name 'CopyFromExternalStageToSnowflakeOperator' from 'airflow.providers.snowflake.transfers.copy_into_snowflake'
cause The installed provider version predates 'CopyFromExternalStageToSnowflakeOperator'; the operator lives in 'airflow.providers.snowflake.transfers.copy_into_snowflake' and replaced the older 'S3ToSnowflakeOperator' from 'airflow.providers.snowflake.transfers.s3_to_snowflake'.
fix
Upgrade the provider with 'pip install --upgrade apache-airflow-providers-snowflake' and keep the import as 'from airflow.providers.snowflake.transfers.copy_into_snowflake import CopyFromExternalStageToSnowflakeOperator'.
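A hedged sketch of a COPY INTO transfer from an external stage; the table, stage, and file pattern are illustrative placeholders to replace with your own objects:

from airflow.providers.snowflake.transfers.copy_into_snowflake import (
    CopyFromExternalStageToSnowflakeOperator,
)

# Loads CSV files from an existing external stage into a Snowflake table.
# MY_TABLE, MY_S3_STAGE and the pattern are placeholders.
load_from_stage = CopyFromExternalStageToSnowflakeOperator(
    task_id="copy_into_snowflake",
    snowflake_conn_id="snowflake_default",
    table="MY_TABLE",
    stage="MY_S3_STAGE",
    file_format="(type = 'CSV', field_delimiter = ',', skip_header = 1)",
    pattern=".*[.]csv",
)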
breaking The minimum supported Apache Airflow version for `apache-airflow-providers-snowflake` 6.12.0 is 2.11.0. Older provider versions had different Airflow minimums (e.g., 2.1.0+, 2.2.0+, 2.3.0+). Ensure your Airflow installation meets this requirement to avoid compatibility issues.
fix Upgrade your Apache Airflow instance to version 2.11.0 or higher. Refer to the Airflow upgrade guide for detailed instructions.
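For example, a typical constrained upgrade (assuming Python 3.10; adjust the constraints URL to your Python version):

pip install "apache-airflow==2.11.0" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.11.0/constraints-3.10.txt"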
breaking In provider versions 4.x and above, the `SnowflakeHook`'s `run` method now conforms to the `DBApiHook` semantics, returning a sequence of sequences (DbApi-compatible results) instead of a dictionary of { 'column': 'value' }. This change affects how results are processed.
fix Adjust your DAGs to expect a sequence of sequences from `SnowflakeHook.run()` or `SnowflakeOperator`'s `handler` function. For example, iterate through `cursor.fetchall()` directly.
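A minimal sketch of the post-4.x access pattern, assuming a `snowflake_default` connection; rows come back as tuples indexed by position rather than as dictionaries:

from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook

hook = SnowflakeHook(snowflake_conn_id="snowflake_default")

# run() now follows DbApiHook semantics: pass a handler to get results back.
rows = hook.run(
    "SELECT id, name FROM AIRFLOW_TEST_TABLE",
    handler=lambda cursor: cursor.fetchall(),
)

for row in rows:
    # Each row is a sequence (e.g. a tuple), not a {'column': 'value'} mapping.
    print(row[0], row[1])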
breaking As of provider version 6.3.0, the `private_key_content` field in Snowflake connections using key-pair authentication is expected to be a base64 encoded string. Existing connections with unencoded private key content will break.
fix Base64 encode your private key content when configuring Snowflake connections in Airflow UI or environment variables. Update existing connections to use the base64 encoded string.
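A small sketch of producing the encoded value (the key path is a placeholder); paste the output into the connection's `private_key_content` field:

import base64
from pathlib import Path

# Path to your PEM-encoded private key (placeholder).
key_bytes = Path("/path/to/rsa_key.p8").read_bytes()

# Base64 encode the key content as expected by provider >= 6.3.0.
print(base64.b64encode(key_bytes).decode("utf-8"))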
gotcha When upgrading, ensure `snowflake-connector-python` and `snowflake-sqlalchemy` versions are compatible with your `apache-airflow-providers-snowflake` and `apache-airflow` versions. Conflicts can lead to `ModuleNotFoundError` or unexpected behavior (e.g., `sqlalchemy.sql.roles` issues with `snowflake-sqlalchemy==1.2.5`).
fix Always check the `requirements` section in the official documentation for specific compatible versions. For `snowflake-sqlalchemy` issues, downgrading to `1.2.4` has resolved conflicts with older `sqlalchemy` versions in the past.
gotcha If you encounter 'No module named 'snowflake'' errors even though the connector is installed, and a file named `snowflake.py` exists in your DAGs folder or another location on the Python path, the local file is likely shadowing the installed provider/connector modules (a Python import path conflict).
fix Rename any custom Python files named `snowflake.py` to avoid conflicts with the `airflow.providers.snowflake` package structure.
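One quick way to confirm which `snowflake` module Python actually resolves is to inspect its import spec:

import importlib.util

# If this points at a file inside your DAGs folder rather than site-packages,
# a local snowflake.py is shadowing the installed connector package.
spec = importlib.util.find_spec("snowflake")
print(spec.origin, list(spec.submodule_search_locations or []))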
deprecated In versions 6.x and later, all deprecated classes, parameters, and features have been removed from the Snowflake provider package, including the `apply_default` decorator.
fix Update your DAGs and custom code to use the current, non-deprecated classes and parameters. Refer to the changelog for specific removals if migrating from older versions.
gotcha The `SnowflakeOperator`'s `autocommit` parameter defaults to `True`. If you rely on autocommit behavior, or are migrating from very old versions, be aware that changes in the `common.sql` provider can affect how `autocommit` is applied in some contexts. For complex SQL, `SQLExecuteQueryOperator` gives better control over transactions.
fix Explicitly set `autocommit=True` in `SnowflakeOperator` if that's the desired behavior. For more robust transactional control, consider using `SQLExecuteQueryOperator` from `apache-airflow-providers-common-sql`.
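A hedged sketch of the more explicit pattern, running several statements in one task with autocommit disabled (the table and values are placeholders):

from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

# Runs all statements on one connection with autocommit off; the explicit
# BEGIN/COMMIT makes the transactional boundary visible in the SQL itself.
transactional_load = SQLExecuteQueryOperator(
    task_id="transactional_load",
    conn_id="snowflake_default",
    autocommit=False,
    split_statements=True,
    sql="""
        BEGIN;
        DELETE FROM AIRFLOW_TEST_TABLE WHERE id = 1;
        INSERT INTO AIRFLOW_TEST_TABLE (id, name) VALUES (1, 'Airflow');
        COMMIT;
    """,
)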
breaking The `apache-airflow-providers-snowflake` package, specifically the `SnowflakeSqlApiHook`, now requires the `aiohttp` library. If `aiohttp` is not installed, importing or using Snowflake components will result in a `ModuleNotFoundError: No module named 'aiohttp'`.
fix Install the `aiohttp` package in your Airflow environment (e.g., `pip install apache-airflow-providers-snowflake[aiohttp]` or `pip install aiohttp`).
breaking When installing `snowflake-connector-python` in minimal Linux environments (like Alpine Linux, often used in Docker images), the installation may fail with `error: command 'g++' failed: No such file or directory`. This occurs because `snowflake-connector-python` requires a C/C++ compiler (like g++) to build its C extensions (e.g., for Nanoarrow), and these build tools are not included by default in such minimal distributions.
fix Before installing `snowflake-connector-python`, ensure build essential packages are installed in your environment. For Alpine Linux, add `python3-dev` and `build-base` packages: `apk add --no-cache python3-dev build-base`.
Observed install behaviour by Python version and base image:

| Python | OS / libc     | Status      | Wheel | Install | Import | Disk |
|--------|---------------|-------------|-------|---------|--------|------|
| 3.9    | slim (glibc)  | wheel       | -     | 43.5s   | -      | 546M |
| 3.9    | alpine (musl) | build_error | -     | -       | -      | -    |
| 3.10   | slim (glibc)  | wheel       | -     | 38.5s   | -      | 595M |
| 3.10   | alpine (musl) | build_error | -     | -       | -      | -    |
| 3.11   | slim (glibc)  | wheel       | -     | 37.0s   | -      | 640M |
| 3.11   | alpine (musl) | build_error | -     | -       | -      | -    |
| 3.12   | slim (glibc)  | wheel       | -     | 32.0s   | -      | 627M |
| 3.12   | alpine (musl) | build_error | -     | -       | -      | -    |
| 3.13   | slim (glibc)  | wheel       | -     | 32.3s   | -      | 628M |
| 3.13   | alpine (musl) | build_error | -     | -       | -      | -    |

This quickstart DAG demonstrates how to run SQL against Snowflake from Airflow. Because `SnowflakeOperator` is deprecated (and, per the note above, deprecated classes were removed in provider 6.x), the example uses the generic `SQLExecuteQueryOperator` from `apache-airflow-providers-common-sql`, pointed at a Snowflake connection. The DAG creates a table, inserts sample data, and then queries it, printing the results to the task logs. Ensure your Airflow environment has a Snowflake connection named `snowflake_default` (or the value of the `SNOWFLAKE_CONN_ID` environment variable) configured with appropriate credentials (account, login, password, warehouse, database, schema).

from __future__ import annotations
import os
import pendulum

from airflow.models.dag import DAG
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

SNOWFLAKE_CONN_ID = os.environ.get('SNOWFLAKE_CONN_ID', 'snowflake_default')

with DAG(
    dag_id='snowflake_quickstart_example',
    start_date=pendulum.datetime(2023, 1, 1, tz='UTC'),
    catchup=False,
    schedule=None,
    tags=['snowflake', 'example'],
    doc_md="""### Snowflake Quickstart DAG
    This DAG demonstrates a basic interaction with Snowflake using SQLExecuteQueryOperator.
    It creates a table, inserts data, and then queries the table.
    """,
) as dag:
    # Create the target table if it does not already exist
    create_table = SQLExecuteQueryOperator(
        task_id='create_snowflake_table',
        conn_id=SNOWFLAKE_CONN_ID,
        sql="CREATE TABLE IF NOT EXISTS AIRFLOW_TEST_TABLE (id INTEGER, name VARCHAR);",
    )

    # Insert two sample rows
    insert_data = SQLExecuteQueryOperator(
        task_id='insert_data_into_table',
        conn_id=SNOWFLAKE_CONN_ID,
        sql="INSERT INTO AIRFLOW_TEST_TABLE (id, name) VALUES (1, 'Airflow'), (2, 'Snowflake');",
    )

    # Query the table and print each row to the task log
    query_data = SQLExecuteQueryOperator(
        task_id='query_snowflake_table',
        conn_id=SNOWFLAKE_CONN_ID,
        sql="SELECT * FROM AIRFLOW_TEST_TABLE;",
        handler=lambda cursor: [print(row) for row in cursor.fetchall()],
    )

    # Define task dependencies
    create_table >> insert_data >> query_data
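
To try the DAG without waiting for the scheduler, run it once from the CLI after placing the file in your DAGs folder:

airflow dags test snowflake_quickstart_example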