Apache Airflow Oracle Provider

raw JSON →
4.5.2 verified Sat Apr 25 auth: no python

The Apache Airflow Oracle Provider enables Airflow to interact with Oracle databases. It includes hooks and operators to execute SQL commands, manage data, and transfer data between Oracle instances. As of version 4.5.2, it leverages the modern `oracledb` driver for Python 3.10+ and is actively maintained, with releases often aligned with Apache Airflow's own release cycle or independently for bug fixes and features.

pip install apache-airflow-providers-oracle
error ModuleNotFoundError: No module named 'oracledb'
cause The 'oracledb' module is not installed in the environment.
fix
Install the 'oracledb' module using pip: 'pip install oracledb'.
error ModuleNotFoundError: No module named 'airflow.providers.oracle'
cause The 'apache-airflow-providers-oracle' package is not installed in the Airflow environment.
fix
Install the 'apache-airflow-providers-oracle' package using pip: 'pip install apache-airflow-providers-oracle'.
error Cannot find Oracle connection in Apache Airflow UI
cause The Oracle provider is not installed, so the Oracle connection type is not available in the Airflow UI.
fix
Install the Oracle provider using pip: 'pip install apache-airflow-providers-oracle'.
error DPY-3010: connections to this database server version are not supported by python-oracledb in thin mode.
cause The `oracledb` driver, used by `apache-airflow-providers-oracle`, defaults to thin mode which might not support connections to older Oracle database versions (e.g., pre-12.1).
fix
Enable 'thick mode' for the Oracle connection, which requires the Oracle Instant Client. Set thick_mode=True in your OracleHook initialization or your Airflow connection's extra parameters (e.g., {"thick_mode": true}). Additionally, ensure the Oracle Instant Client libraries are accessible to your Airflow environment and configured correctly via oracledb.init_oracle_client(lib_dir="/path/to/instantclient").
error DPI-1047: Cannot locate a 64-bit Oracle Client library: "/opt/oracle/lib/libclntsh.so: cannot open shared object file: No such file or directory"
cause This error indicates that the `oracledb` driver, when operating in thick mode, cannot find the necessary Oracle Instant Client libraries at the specified path or in the system's library search paths.
fix
Install the 64-bit Oracle Instant Client and ensure its lib directory is correctly pointed to by setting the ORACLE_HOME environment variable, or by passing the lib_dir parameter to oracledb.init_oracle_client(). For Docker, make sure the Instant Client is unzipped correctly into the image and environment variables are set.
breaking The Oracle provider, especially for Python 3.10+, now primarily relies on the `oracledb` library. If you were previously using `cx_oracle`, ensure you transition to `oracledb`.
fix Uninstall `cx_oracle` if present (`pip uninstall cx_oracle`) and ensure `oracledb` is installed (`pip install oracledb`). The provider's `setup.py` usually handles this correctly on installation if `cx_oracle` is not already present.
gotcha `oracledb` can operate in 'thin mode' (no client libraries required) or 'thick mode' (requires Oracle Client libraries like Instant Client). Thick mode is necessary for features like Connection Pooling, Two-Phase Commits, or advanced data types. Many users forget to configure client libraries for thick mode.
fix For thick mode, download and install Oracle Instant Client, then ensure the appropriate `LD_LIBRARY_PATH` (Linux/macOS) or `PATH` (Windows) environment variable is set in the environment where Airflow workers run, or use `oracledb.init_oracle_client()` with the library directory.
gotcha Correctly configuring the Oracle Airflow Connection in the UI can be tricky. Pay attention to 'Schema' field, which can be either the Service Name or the SID, depending on your Oracle setup. Also, `TNS_ADMIN` environment variable can be used for TNS entry resolution but must be configured for the Airflow worker.
fix Verify with your DBA whether to use Service Name or SID. For Service Name, it's typically 'ServiceName' (e.g., `ORCLCDB`); for SID, it's 'SID' (e.g., `ORCL`). If using TNS, ensure the `TNS_ADMIN` environment variable is correctly set for the Airflow process and points to the directory containing your `tnsnames.ora` file.
runtime status import time mem disk
3.10-alpine
3.10-slim 3.41s 69.4MB 250M
3.11-alpine
3.11-slim 6.22s 75.5MB 270M
3.12-alpine
3.12-slim 6.07s 74.2MB 261M
3.13-alpine
3.13-slim 5.92s 74.8MB 263M
3.9-alpine
3.9-slim 5.61s 63.6MB 214M

This quickstart demonstrates a simple Airflow DAG using the `OracleOperator` to create a table and insert data into an Oracle database. Before running, configure an 'Oracle' connection in the Airflow UI with `Conn Id: oracle_default` and your Oracle database credentials.

from __future__ import annotations

import pendulum

from airflow.models.dag import DAG
from airflow.providers.oracle.operators.oracle import OracleOperator

# Ensure an Airflow connection named 'oracle_default' exists with Oracle DB details.
# In the Airflow UI (Admin -> Connections -> +), create a new connection:
# - Conn Id: oracle_default
# - Conn Type: Oracle
# - Host: your_oracle_host
# - Port: 1521 (or your port)
# - Schema: your_service_name_or_sid
# - Login: your_username
# - Password: your_password

with DAG(
    dag_id="oracle_example_dag",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    catchup=False,
    schedule=None,
    tags=["oracle", "example", "database"],
) as dag:
    create_table = OracleOperator(
        task_id="create_test_table",
        oracle_conn_id="oracle_default",
        sql="""
        CREATE TABLE IF NOT EXISTS AIRFLOW_TEST_TABLE (
            ID NUMBER PRIMARY KEY,
            NAME VARCHAR2(50)
        )
        """,
        autocommit=True,
    )

    insert_data = OracleOperator(
        task_id="insert_data_into_table",
        oracle_conn_id="oracle_default",
        sql="""
        INSERT INTO AIRFLOW_TEST_TABLE (ID, NAME) VALUES (1, 'Test User')
        """,
        autocommit=True,
    )

    create_table >> insert_data