Apache Airflow Oracle Provider
The Apache Airflow Oracle Provider enables Airflow to interact with Oracle databases. It includes hooks and operators to execute SQL commands, manage data, and transfer data between Oracle instances. As of version 4.5.2, it leverages the modern `oracledb` driver for Python 3.10+ and is actively maintained, with releases often aligned with Apache Airflow's own release cycle or independently for bug fixes and features.
Warnings
- breaking The Oracle provider, especially for Python 3.10+, now primarily relies on the `oracledb` library. If you were previously using `cx_oracle`, ensure you transition to `oracledb`.
- gotcha `oracledb` can operate in 'thin mode' (no client libraries required) or 'thick mode' (requires Oracle Client libraries like Instant Client). Thick mode is necessary for features like Connection Pooling, Two-Phase Commits, or advanced data types. Many users forget to configure client libraries for thick mode.
- gotcha Correctly configuring the Oracle Airflow Connection in the UI can be tricky. Pay attention to 'Schema' field, which can be either the Service Name or the SID, depending on your Oracle setup. Also, `TNS_ADMIN` environment variable can be used for TNS entry resolution but must be configured for the Airflow worker.
Install
-
pip install apache-airflow-providers-oracle
Imports
- OracleHook
from airflow.providers.oracle.hooks.oracle import OracleHook
- OracleOperator
from airflow.providers.oracle.operators.oracle import OracleOperator
- OracleToOracleOperator
from airflow.providers.oracle.transfers.oracle_to_oracle import OracleToOracleOperator
Quickstart
from __future__ import annotations
import pendulum
from airflow.models.dag import DAG
from airflow.providers.oracle.operators.oracle import OracleOperator
# Ensure an Airflow connection named 'oracle_default' exists with Oracle DB details.
# In the Airflow UI (Admin -> Connections -> +), create a new connection:
# - Conn Id: oracle_default
# - Conn Type: Oracle
# - Host: your_oracle_host
# - Port: 1521 (or your port)
# - Schema: your_service_name_or_sid
# - Login: your_username
# - Password: your_password
with DAG(
dag_id="oracle_example_dag",
start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
catchup=False,
schedule=None,
tags=["oracle", "example", "database"],
) as dag:
create_table = OracleOperator(
task_id="create_test_table",
oracle_conn_id="oracle_default",
sql="""
CREATE TABLE IF NOT EXISTS AIRFLOW_TEST_TABLE (
ID NUMBER PRIMARY KEY,
NAME VARCHAR2(50)
)
""",
autocommit=True,
)
insert_data = OracleOperator(
task_id="insert_data_into_table",
oracle_conn_id="oracle_default",
sql="""
INSERT INTO AIRFLOW_TEST_TABLE (ID, NAME) VALUES (1, 'Test User')
""",
autocommit=True,
)
create_table >> insert_data