DAG Factory

1.0.1 · active · verified Sun Apr 12

dag-factory is an open-source Python library that dynamically generates Apache Airflow DAGs from YAML configuration files. Its declarative syntax lets users define complex data pipelines without extensive Python knowledge and keeps large numbers of DAGs consistent. The library is actively maintained by Astronomer and updated regularly; the current stable version is 1.0.1.

Warnings

Install

Imports

Quickstart

This quickstart defines an Airflow DAG in YAML and then generates it with `dag-factory`'s `load_yaml_dags` function. `load_yaml_dags` is meant to be called from a Python file in the Airflow DAGs folder. It adds each generated DAG to the dictionary passed as `globals_dict` (normally that file's `globals()`), which makes the DAGs discoverable when Airflow parses the file.

import os
from dagfactory import load_yaml_dags

# Define a simple YAML configuration for a DAG
# In a real Airflow setup, this would be in a .yml file in your DAGs folder
# e.g., dags/my_dag.yml
yaml_config_content = '''
my_example_dag:
  default_args:
    owner: 'airflow'
    start_date: '2023-01-01'
    retries: 1
  schedule: '@daily'
  description: 'A simple example DAG from YAML'
  tasks:
    start_task:
      operator: airflow.operators.bash.BashOperator
      bash_command: 'echo "Starting DAG!"'
    end_task:
      operator: airflow.operators.bash.BashOperator
      bash_command: 'echo "DAG finished."'
      dependencies: [start_task]
'''

# For demonstration only, write the YAML to a file in the DAGs folder.
# In a real Airflow environment, the YAML file would already exist there.
# The DAGs folder is normally $AIRFLOW_HOME/dags, not AIRFLOW_HOME itself.
dags_folder = os.path.join(os.environ.get('AIRFLOW_HOME', '.'), 'dags')
os.makedirs(dags_folder, exist_ok=True)
config_filepath = os.path.join(dags_folder, 'my_dag.yml')

with open(config_filepath, 'w') as f:
    f.write(yaml_config_content)

# Generate DAGs from the YAML file and register them in this module's globals().
# This Python file (e.g., dags/dag_generator.py) is the one Airflow parses.
# With config_filepath set, this call loads that specific YAML file.
load_yaml_dags(globals_dict=globals(), config_filepath=config_filepath)
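
The role of the `globals_dict` argument can be illustrated with a toy sketch. This is not dag-factory's implementation: `ToyDag` and `register_toy_dags` are hypothetical names invented for illustration, and the config is a hand-written dict standing in for the parsed YAML. It shows only the general pattern of building one object per top-level key and storing it in a caller-supplied namespace under the DAG ID.

```python
from typing import Any, Dict, List


class ToyDag:
    """Hypothetical stand-in for an Airflow DAG object; not part of dag-factory."""

    def __init__(self, dag_id: str, task_ids: List[str]) -> None:
        self.dag_id = dag_id
        self.task_ids = task_ids


def register_toy_dags(globals_dict: Dict[str, Any], config: Dict[str, Any]) -> List[str]:
    """Create one ToyDag per top-level key and store it in globals_dict."""
    registered = []
    for dag_id, dag_config in config.items():
        task_ids = list(dag_config.get("tasks", {}))
        # Storing the object under its ID in the module namespace is what
        # makes it visible to anything that inspects that namespace.
        globals_dict[dag_id] = ToyDag(dag_id, task_ids)
        registered.append(dag_id)
    return registered


# Hand-written dict mirroring the quickstart YAML (assumed parsed form).
parsed_config = {
    "my_example_dag": {
        "schedule": "@daily",
        "tasks": {
            "start_task": {"bash_command": 'echo "Starting DAG!"'},
            "end_task": {
                "bash_command": 'echo "DAG finished."',
                "dependencies": ["start_task"],
            },
        },
    }
}

namespace: Dict[str, Any] = {}  # stands in for a DAG file's globals()
registered_ids = register_toy_dags(namespace, parsed_config)
```

In a real DAG file the namespace would be `globals()`, so each generated DAG ends up as a module-level variable, the same place a hand-written `dag = DAG(...)` assignment would put it.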
