{"id":4925,"library":"dag-factory","title":"DAG Factory","description":"dag-factory is an open-source Python library that dynamically generates Apache Airflow DAGs from YAML configuration files. It enables users to define complex data pipelines using a declarative syntax, reducing the need for extensive Python knowledge and promoting consistency across many DAGs. The library is actively maintained by Astronomer, with the current stable version being 1.0.1, and receives regular updates and feature enhancements.","status":"active","version":"1.0.1","language":"en","source_language":"en","source_url":"https://github.com/astronomer/dag-factory","tags":["airflow","dag","yaml","automation","declarative","data-pipeline"],"install":[{"cmd":"pip install dag-factory","lang":"bash","label":"Base Installation"},{"cmd":"pip install dag-factory[all]","lang":"bash","label":"With all optional providers"},{"cmd":"pip install dag-factory[kubernetes]","lang":"bash","label":"With Kubernetes provider"}],"dependencies":[{"reason":"Core dependency for dynamic DAG generation. Requires >=2.4.","package":"apache-airflow","optional":false},{"reason":"Optional provider, previously enforced. Install if your DAGs use HTTP operations.","package":"apache-airflow-providers-http","optional":true},{"reason":"Optional provider, previously enforced. Install if your DAGs use KubernetesPodOperator.","package":"apache-airflow-providers-cncf-kubernetes","optional":true}],"imports":[{"note":"As of v1.0.0, the `DagFactory` class is private (`_DagFactory`), and `load_yaml_dags` is the recommended entry point for loading DAGs from YAML files.","wrong":"from dagfactory import DagFactory","symbol":"load_yaml_dags","correct":"from dagfactory import load_yaml_dags"}],"quickstart":{"code":"import os\nfrom dagfactory import load_yaml_dags\n\n# Define a simple YAML configuration for a DAG\n# In a real Airflow setup, this would be in a .yml file in your DAGs folder\n# e.g., dags/my_dag.yml\nyaml_config_content = '''\nmy_example_dag:\n  default_args:\n    owner: 'airflow'\n    start_date: '2023-01-01'\n    retries: 1\n  schedule: '@daily'\n  description: 'A simple example DAG from YAML'\n  tasks:\n    start_task:\n      operator: airflow.operators.bash.BashOperator\n      bash_command: 'echo \"Starting DAG!\"'\n    end_task:\n      operator: airflow.operators.bash.BashOperator\n      bash_command: 'echo \"DAG finished.\"'\n      dependencies: [start_task]\n'''\n\n# For demonstration, we'll write the YAML to a temporary file.\n# In a real Airflow environment, this file would be picked up by the scheduler.\ndags_folder = os.environ.get('AIRFLOW_HOME', './dags')\nos.makedirs(dags_folder, exist_ok=True)\nconfig_filepath = os.path.join(dags_folder, 'my_dag.yml')\n\nwith open(config_filepath, 'w') as f:\n    f.write(yaml_config_content)\n\n# Load DAGs from the YAML file(s) into Airflow's DAG Bag.\n# This Python file (e.g., dags/dag_generator.py) will be parsed by Airflow.\n# All YAML files in the dags_folder (or specified path) will be processed.\nload_yaml_dags(globals_dict=globals(), config_filepath=config_filepath)","lang":"python","description":"This quickstart demonstrates how to define an Airflow DAG using a YAML configuration and then use `dag-factory`'s `load_yaml_dags` function to generate it. The `load_yaml_dags` function is designed to be called within an Airflow DAG file, automatically populating the `globals()` dictionary with the generated DAGs, making them discoverable by the Airflow scheduler."},"warnings":[{"fix":"If your DAGs rely on these providers, you must explicitly install them, e.g., `pip install dag-factory[all]` or `pip install dag-factory[kubernetes]`.","message":"Airflow providers (`apache-airflow-providers-http`, `apache-airflow-providers-cncf-kubernetes`) are no longer automatically installed. They are now optional dependencies.","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"Use the recommended function `from dagfactory import load_yaml_dags` to generate DAGs from your YAML configurations.","message":"The `DagFactory` class is now considered private (`_DagFactory`), and its direct import path (`from dagfactory import DagFactory`) has been removed.","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"Use the `schedule` parameter instead to define DAG schedules.","message":"The `schedule_interval` parameter in YAML DAG configurations is no longer supported.","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"Remove any calls to `example_dag_factory.clean_dags(globals())` from your DAG files. Rely on Airflow's native mechanisms for DAG lifecycle management.","message":"The `clean_dags()` method has been removed. DAG cleanup is now handled directly by Airflow's configuration (`AIRFLOW__DAG_PROCESSOR__REFRESH_INTERVAL`).","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"Ensure your environment meets the minimum requirements: Python 3.9+ and Apache Airflow 2.4+.","message":"Support for older Airflow and Python versions has been dropped.","severity":"breaking","affected_versions":">=0.23.0"},{"fix":"Switch to Airflow's direct equivalents: `dagrun_timeout`, `retry_delay`, `sla`, `execution_delta`, `execution_timeout`. Ensure these are specified using `__type__: datetime.timedelta` for time-related values where applicable.","message":"Several inconsistent YAML parameters (e.g., `dagrun_timeout_sec`, `retry_delay_sec`, `sla_secs`, `execution_delta_secs`, `execution_timeout_secs`) have been removed.","severity":"breaking","affected_versions":">=1.0.0"},{"fix":"For Airflow 3.1.0 and above, `sla_miss_callback` is no longer supported directly within `dag-factory`. Airflow 3 deprecates SLA features in favor of 'deadline alerts'. Consider migrating to deadline alerts or Airflow's standard callback mechanisms.","message":"The `sla_miss_callback` parameter is removed from `dag_kwargs` for Airflow versions >= 3.1.0.","severity":"gotcha","affected_versions":">=1.0.1 (dag-factory), >=3.1.0 (Airflow)"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}