{"id":7437,"library":"mwaa-dr","title":"MWAA Disaster Recovery Solution (mwaa-dr)","description":"mwaa-dr is a Python library that provides a reusable framework for implementing disaster recovery solutions for Amazon Managed Workflows for Apache Airflow (MWAA). It generates Airflow DAGs that export and import MWAA metadata, so that variables, connections, and DAG run history can be backed up and restored. The library ships a separate factory for each supported MWAA version; the latest PyPI release is 2.2.0, and support for newer Airflow versions is still being added.","status":"active","version":"2.2.0","language":"en","source_language":"en","source_url":"https://github.com/aws-samples/mwaa-disaster-recovery.git","tags":["aws","mwaa","airflow","disaster-recovery","backup","restore"],"install":[{"cmd":"pip install mwaa-dr","lang":"bash","label":"PyPI"},{"cmd":"# Add to your MWAA requirements.txt file:\nmwaa-dr==2.2.0","lang":"bash","label":"MWAA Environment"}],"dependencies":[{"reason":"This library generates DAGs for Apache Airflow within an MWAA environment.","package":"apache-airflow","optional":false}],"imports":[{"note":"Replace X_Y with your MWAA/Airflow version (e.g., 2_10 for Airflow 2.10.x, giving `mwaa_dr.v_2_10` and `DRFactory_2_10`). 
The factory class is version-specific.","symbol":"DRFactory_X_Y","correct":"from mwaa_dr.v_X_Y.dr_factory import DRFactory_X_Y"}],"quickstart":{"code":"from airflow import DAG\nfrom airflow.utils.dates import days_ago\nfrom mwaa_dr.v_2_10.dr_factory import DRFactory_2_10\n\n# Ensure the DR_BACKUP_BUCKET Airflow variable is set in your MWAA environment\n# and that the MWAA execution role has read/write permissions on the bucket.\n# Example: DR_BACKUP_BUCKET = 'your-mwaa-backup-bucket'\n\n# Initialize the DRFactory for your MWAA/Airflow version\n# For local testing with aws-mwaa-local-runner, use storage_type='LOCAL_FS'\n# and create a 'data' folder in your dags directory.\nfactory = DRFactory_2_10(\n    dag_id='backup_metadata_example',\n    path_prefix='data', # Relative path within the S3 bucket or local file system\n    storage_type='S3' # Or 'LOCAL_FS' for local development\n)\n\n# Create a backup DAG\nbackup_dag: DAG = factory.create_backup_dag(\n    schedule_interval='@daily', # Example schedule\n    start_date=days_ago(1)\n)\n\n# Create a restore DAG (paused on creation, meant for manual trigger)\nrestore_dag: DAG = factory.create_restore_dag(\n    dag_id='restore_metadata_example',\n    start_date=days_ago(1),\n    is_paused_upon_creation=True # Recommended for restore DAGs\n)\n\n# Create a cleanup DAG (for emptying metadata tables before restore, use with caution)\ncleanup_dag: DAG = factory.create_cleanup_dag(\n    dag_id='cleanup_metadata_example',\n    start_date=days_ago(1),\n    is_paused_upon_creation=True # Recommended for cleanup DAGs\n)","lang":"python","description":"This quickstart uses `mwaa-dr` to create a daily metadata backup DAG, a manually triggered restore DAG, and a cleanup DAG for an MWAA environment running Apache Airflow 2.10.x. Before running it, ensure that an S3 bucket is configured for backups and that the `DR_BACKUP_BUCKET` Airflow variable is set in your MWAA environment. 
The MWAA execution role must have appropriate S3 permissions."},"warnings":[{"fix":"Monitor the `mwaa-dr` GitHub repository for updates and official support for Airflow 3.0. Ensure your Airflow environment is on version 2.10.x for the best compatibility path to future Airflow 3.x upgrades.","message":"Direct metadata database access from Airflow workers is being removed in Apache Airflow 3.x. mwaa-dr needs updates to support Airflow 3.0.","severity":"breaking","affected_versions":"MWAA versions supporting Airflow 3.0 and above"},{"fix":"Always import `DRFactory_X_Y` where `X_Y` matches your exact MWAA/Airflow environment version (e.g., `from mwaa_dr.v_2_10.dr_factory import DRFactory_2_10` for MWAA 2.10.x). The supported versions are listed in the `mwaa-disaster-recovery` GitHub README.","message":"The import path for `DRFactory` is version-specific to your MWAA/Airflow environment. Using the wrong version (e.g., `DRFactory_2_5` for an Airflow 2.10.3 environment) will lead to import errors or unexpected behavior.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always use the `cleanup_metadata` DAG (created via `factory.create_cleanup_dag()`) to clear the target MWAA metadata tables *before* running a restore. Ensure this DAG is paused upon creation and triggered manually only when necessary. Review the tables backed up by default and consider overriding `dr_factory.setup_tables()` for custom table sets.","message":"For metadata restore to work correctly, the target database usually needs to be empty to avoid foreign key constraint violations. The solution provides a `cleanup_metadata` DAG for this purpose, which should be used with extreme caution.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Set these Airflow variables to `DO_NOTHING`, `APPEND`, or `REPLACE` based on your disaster recovery strategy. 
If you use AWS Secrets Manager for variables or connections, set the corresponding strategy to `DO_NOTHING` so that `mwaa-dr` does not restore them from the S3 backup.","message":"The Airflow variables `DR_VARIABLE_RESTORE_STRATEGY` and `DR_CONNECTION_RESTORE_STRATEGY` control how variables and connections are restored. Incorrect settings can lead to unintended overwrites or data loss, especially if you use AWS Secrets Manager.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure `mwaa-dr` is added to your MWAA environment's `requirements.txt` file (e.g., `mwaa-dr==2.2.0`). Verify that `X_Y` in the import statement (`from mwaa_dr.v_X_Y.dr_factory import DRFactory_X_Y`) exactly matches your MWAA environment's Apache Airflow version.","cause":"The Python environment where the DAG is being parsed or run does not have the `mwaa-dr` library installed, or the version-specific import path is incorrect.","error":"ModuleNotFoundError: No module named 'mwaa_dr.v_X_Y'"},{"fix":"If you customize the tables to be backed up, ensure that essential tables such as `xcom` and `task_instance` are included. If you see this error with default settings, upgrade to the latest `mwaa-dr` version and report the issue if it persists. Review the default `setup_tables()` method in `BaseDRFactory` for your MWAA version.","cause":"This can occur if the default list of tables to be backed up or restored (which includes `xcom` and `task_instance`) is modified or incomplete, leading to inconsistencies. A specific bug was reported for `create_backup_dag` when `xcom` and `task_instance` are excluded.","error":"IndexError: tuple index out of range (or similar errors during backup/restore)"},{"fix":"Before performing a restore operation, ensure the target MWAA metadata database is clean. Use the `cleanup_metadata` DAG (created via `factory.create_cleanup_dag()`) to empty the necessary tables. 
Always exercise caution when running cleanup DAGs.","cause":"Attempting to restore metadata into an MWAA database that is not empty, leading to conflicts with existing entries or their dependencies. This was specifically reported for Airflow 2.8.1.","error":"Foreign Key Violation (e.g., ForeignKeyViolation: insert or update on table 'dag_run' violates foreign key constraint 'task_instance_log_template_id_fkey')"},{"fix":"Verify that your MWAA execution role has `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`, and `s3:ListBucket` permissions on the designated S3 backup bucket (`DR_BACKUP_BUCKET`) and its contents.","cause":"The MWAA execution role associated with your environment lacks the necessary S3 permissions to read from or write to the configured backup S3 bucket.","error":"An error occurred (AccessDenied) when calling the GetObject operation (or similar S3 permission errors)"},{"fix":"Review MWAA networking prerequisites. For private routing, ensure necessary VPC service endpoints (S3, Monitoring, ECR) are configured. For public routing, ensure correct public/private subnet setup and internet gateway. Use the `AWSSupport-TroubleshootMWAAEnvironmentCreation` runbook if an environment is stuck during creation.","cause":"Although not directly `mwaa-dr` specific, an improperly configured MWAA environment (VPC, subnets, security groups, NAT gateway, VPC endpoints) can prevent the environment from starting or updating, which in turn affects `mwaa-dr` DAG deployment and execution.","error":"MWAA environment stuck in 'Creating' or 'Updating' state due to networking issues."}]}