Apache Airflow Provider for YDB
2.5.2 · verified Sat May 09 · auth: no · python
Apache Airflow provider that enables integration with Yandex Database (YDB). The current version is 2.5.2; it supports Airflow 2.8+ and Python 3.10+. Releases follow a monthly cadence alongside Airflow.
pip install apache-airflow-providers-ydb
Common errors
error airflow.exceptions.AirflowException: Failed to create YDB driver. Check your connection parameters.
cause Connection 'ydb_default' is missing required extra fields 'endpoint' or 'database'.
fix
In the Airflow UI, edit the YDB connection and set Extra to: {"endpoint": "grpcs://ydb.serverless.yandexcloud.net:2135", "database": "/ru-central1/b1g..."}
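The Extra value can be checked before it is saved. The following is a minimal sketch, assuming that `endpoint` and `database` are the only required keys, as the cause above states. The helper name `missing_ydb_extra_keys` is hypothetical and is not part of the provider.

```python
import json

# Keys named as required in the error cause above (an assumption taken from this page).
REQUIRED_KEYS = ("endpoint", "database")


def missing_ydb_extra_keys(extra_json: str) -> list[str]:
    """Return the required keys that are absent or empty in the Extra JSON."""
    extra = json.loads(extra_json) if extra_json else {}
    return [key for key in REQUIRED_KEYS if not extra.get(key)]
```

An empty list means both keys are present; any other result names the fields to add to the connection's Extra.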
error ModuleNotFoundError: No module named 'ydb'
cause YDB SDK is not installed. The provider does not include it automatically.
fix
Run: pip install ydb
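To confirm the SDK is visible to the interpreter that runs Airflow, a small check with the standard library may help. This is a sketch only; the helper name `is_module_installed` is hypothetical.

```python
import importlib.util


def is_module_installed(name: str) -> bool:
    """Return True if the import system can locate a top-level module with this name."""
    return importlib.util.find_spec(name) is not None
```

Calling `is_module_installed("ydb")` in the worker's environment should return True after the install; if it returns False, the package was probably installed into a different environment.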
error TypeError: YdbOperator.__init__() got an unexpected keyword argument 'sql'
cause The installed provider package is older than v2.0.0, where the operator took a 'query' parameter instead of 'sql'.
fix
Upgrade to >=2.0.0: pip install -U apache-airflow-providers-ydb. Use 'sql' parameter.
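The installed provider version can be read from package metadata before deciding which parameter name applies. The sketch below assumes the distribution name from the install command on this page and the 2.0.0 boundary stated in the cause; the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional

# Distribution name taken from the pip command on this page.
DIST_NAME = "apache-airflow-providers-ydb"


def major_version(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.split(".")[0])


def uses_sql_parameter(dist_name: str = DIST_NAME) -> Optional[bool]:
    """True if the installed major version is >= 2, False if older, None if not installed."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return major_version(installed) >= 2
```

A result of False suggests the upgrade above is still needed; None means the provider is not installed in this environment at all.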
error ydb.issues.Unavailable: cannot connect to database, session is not ready
cause Connection timeout or network issue; often due to missing TLS certificate or wrong endpoint.
fix
Ensure endpoint is reachable and correct. If using self-signed certs, set extra: {"ssl_verify": false}.
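A malformed endpoint string is easy to rule out before looking at the network. The sketch below assumes endpoints use the `grpc://` or `grpcs://` scheme with an explicit port, as in the example on this page; the helper name `split_ydb_endpoint` is hypothetical.

```python
from urllib.parse import urlsplit


def split_ydb_endpoint(endpoint: str) -> tuple[str, int, bool]:
    """Split an endpoint into (host, port, uses_tls), raising ValueError if malformed."""
    parts = urlsplit(endpoint)
    # Assumption: only grpc:// (plain) and grpcs:// (TLS) are valid schemes.
    if parts.scheme not in ("grpc", "grpcs"):
        raise ValueError(f"unexpected scheme in endpoint: {endpoint!r}")
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"endpoint needs host and port: {endpoint!r}")
    return parts.hostname, parts.port, parts.scheme == "grpcs"
```

The returned host and port can then be passed to `socket.create_connection` from the worker machine to test basic reachability.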
Warnings
breaking In version 2.0.0, connection parameters changed from token-based to connection string. Existing connections must be updated.
fix Update Airflow connection to use 'extra' field with 'endpoint' and 'database' keys instead of 'token'.
deprecated YdbToDWHOperator is deprecated as of v2.5.0; use YdbToClickhouseOperator or a custom transfer.
fix Migrate to YdbToClickhouseOperator for ClickHouse integration.
gotcha SSL verification is enabled by default; internal clusters may need it disabled in the connection extra.
fix Set "ssl_verify": false in the connection's Extra JSON if using self-signed certificates.
Imports
- YdbHook
  wrong: import ydb
  correct: from airflow.providers.ydb.hooks.ydb import YdbHook
- YdbOperator
  wrong: from airflow.operators import YdbOperator
  correct: from airflow.providers.ydb.operators.ydb import YdbOperator
- YdbToDWHOperator
  wrong: from airflow.transfers import YdbToDWHOperator
  correct: from airflow.providers.ydb.transfers.ydb_to_dwh import YdbToDWHOperator
Quickstart
from datetime import datetime
from airflow import DAG
from airflow.providers.ydb.operators.ydb import YdbOperator

default_args = {
    'start_date': datetime(2024, 1, 1),
    'conn_id': 'ydb_default',
}

# schedule=None means the DAG runs only when triggered manually;
# 'schedule' replaces the deprecated 'schedule_interval' in Airflow 2.4+.
with DAG('ydb_dag', default_args=default_args, schedule=None) as dag:
    task = YdbOperator(
        task_id='execute_query',
        sql='SELECT 1;',
        ydb_conn_id='ydb_default',
    )