Apache Airflow HashiCorp Provider
The Apache Airflow HashiCorp provider package (`apache-airflow-providers-hashicorp`) integrates Apache Airflow with HashiCorp products, primarily Vault. It offers hooks and a secrets backend for managing secrets and interacting with Vault. At the time of writing the current version is 4.5.2; the package follows a regular release cadence, often aligning with Apache Airflow's major and minor releases, with independent updates for features and bug fixes.
Common errors
- ModuleNotFoundError: No module named 'airflow.contrib.hooks.vault_hook'
  cause: Attempting to import a provider component from the deprecated `airflow.contrib` package.
  fix: Update imports to use the provider path: `from airflow.providers.hashicorp.hooks.vault import VaultHook`.
- hvac.exceptions.VaultError: (403, 'permission denied')
  cause: The Vault credentials (token, AppRole, etc.) used by the Airflow connection lack permission to access the specified secret path.
  fix: Verify that the Vault token or authentication method used in the Airflow connection has the policies needed to read from or write to the desired Vault path.
- airflow.exceptions.AirflowException: The conn_id `vault_default` isn't defined
  cause: The Airflow connection ID specified in the hook or secrets backend is not configured in Airflow.
  fix: Create a new 'Hashicorp Vault' connection in the Airflow UI (Admin -> Connections) or define it via an environment variable (e.g., `AIRFLOW_CONN_VAULT_DEFAULT=vault://:<token>@<host>:8200`; with token auth the token belongs in the password slot of the URI).
- hvac.exceptions.InvalidRequest: no handler for route 'v1/auth/kubernetes/login'
  cause: Kubernetes authentication is requested but the Kubernetes auth method is not enabled or configured on the Vault server at the expected path.
  fix: Ensure the Kubernetes auth method is enabled and configured on your Vault server, and that the `kubernetes_mount_point` in your Airflow Vault connection (or `backend_kwargs`) matches its path.
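For the missing `vault_default` connection, the quickest fix is often an environment variable. A minimal sketch, assuming token authentication (with token auth the token is read from the connection's password field, hence the `:` before it; host and token are placeholders):

```shell
# Placeholder host/token -- substitute your real Vault address and token.
export AIRFLOW_CONN_VAULT_DEFAULT='vault://:my-token@localhost:8200'
```

Any Airflow component started in this environment (scheduler, worker, `airflow tasks test`) will then resolve `vault_default` without a row in the metadata DB.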
Warnings
- breaking Airflow providers were refactored in Airflow 2.0+. All `airflow.contrib` imports for Hashicorp components are removed. Using old import paths will result in `ModuleNotFoundError`.
- gotcha Configuring the Vault secrets backend (`VaultBackend`) requires specific Airflow environment variables (`AIRFLOW__SECRETS__BACKEND`, `AIRFLOW__SECRETS__BACKEND_KWARGS`). Misconfiguration often leads to secrets not being fetched or authentication errors.
- gotcha Vault authentication can be complex, and issues often manifest as `hvac.exceptions.VaultError`. Common pitfalls include incorrect tokens, expired credentials, or misconfigured AppRole/Kubernetes authentication.
- gotcha The `VaultHook` relies on an Airflow connection. If the `vault_conn_id` specified in your DAG does not exist or is misconfigured, tasks will fail with connection errors.
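The secrets-backend wiring from the first gotcha above typically looks like the following; the mount point, paths, URL, and token shown are illustrative placeholders:

```shell
# Route Airflow connection/variable lookups through Vault.
export AIRFLOW__SECRETS__BACKEND='airflow.providers.hashicorp.secrets.vault.VaultBackend'
# Example values: secrets live under the 'airflow' KV mount, e.g. a
# connection 'smtp_default' would be read from airflow/connections/smtp_default.
export AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_path": "connections", "variables_path": "variables", "mount_point": "airflow", "url": "http://localhost:8200", "token": "my-token"}'
```

With this in place, Airflow checks Vault before the environment and the metadata database when resolving connections and variables.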
Install
-
pip install apache-airflow-providers-hashicorp
Imports
- VaultHook
# deprecated (Airflow 1.x `contrib` path, removed in Airflow 2.0+):
from airflow.contrib.hooks.vault_hook import VaultHook
# current:
from airflow.providers.hashicorp.hooks.vault import VaultHook
- VaultBackend
from airflow.providers.hashicorp.secrets.vault import VaultBackend
Quickstart
from datetime import datetime

from airflow.decorators import task
from airflow.models.dag import DAG
from airflow.providers.hashicorp.hooks.vault import VaultHook

# Ensure you have a 'vault_default' connection configured in Airflow with the
# appropriate Vault address and authentication details.
# For local testing, you might need a local Vault instance and a token.
# Example: connection type 'Hashicorp Vault', Host: 'http://localhost:8200',
# auth type 'token', with the token in the Password field.

with DAG(
    dag_id='example_vault_read_secret',
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
    tags=['vault', 'secrets'],
) as dag:

    @task
    def read_my_secret() -> str:
        hook = VaultHook(vault_conn_id='vault_default')  # Ensure this connection is configured
        # get_secret returns the secret's key/value mapping. For KV v2, pass
        # the path without the 'data/' segment; the mount point comes from
        # the connection (default 'secret').
        creds = hook.get_secret(secret_path='my-app/db-creds')  # Example path; replace with yours
        return creds['username']  # The specific key within the secret to retrieve

    db_username = read_my_secret()
    # The returned value is pushed to XCom under the key 'return_value'.
    # Downstream tasks can consume it directly, e.g.:
    # @task
    # def use_secret_value(username: str):
    #     print(f"Retrieved DB Username: {username}")
    #
    # use_secret_value(db_username)
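Note on KV v2 paths: when reading the same secret with raw `hvac`, the payload is nested under `data.data` in the response, whereas `VaultHook.get_secret` hands back the inner mapping already unwrapped. A minimal sketch of the unwrapping, using a fabricated sample response:

```python
# Sample KV v2 read response. The structure mirrors what hvac returns;
# the values themselves are made up for illustration.
kv2_response = {
    "data": {
        "data": {"username": "app_user", "password": "s3cr3t"},
        "metadata": {"version": 3},
    }
}

def unwrap_kv2(response: dict) -> dict:
    """Return the secret key/value mapping from a KV v2 read response."""
    return response["data"]["data"]

print(unwrap_kv2(kv2_response)["username"])  # -> app_user
```

Forgetting this extra `data` level (or, conversely, including `data/` in a hook path that does not expect it) is a frequent source of `KeyError`s and "secret not found" surprises.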