Apache Airflow GitHub Provider
2.11.2 · verified Thu Apr 16 · auth: no · python
The `apache-airflow-providers-github` package integrates Apache Airflow with GitHub. It lets workflows work with repositories, issues, pull requests, and other GitHub resources directly from Airflow tasks. This actively maintained provider, currently at version 2.11.2, is part of Airflow's modular provider ecosystem, so it is released and updated independently of the core Airflow project.
pip install apache-airflow-providers-github
Common errors
error ModuleNotFoundError: No module named 'airflow.providers.github.operators.github'
cause The `apache-airflow-providers-github` package is not installed in the Airflow environment, or Python cannot find it.
fix Run `pip install apache-airflow-providers-github` to install the provider. If running in a Dockerized environment, add the package to your requirements.txt or install command, then rebuild your Docker image.
error Failed to execute GithubOperator, error: 401 {'message': 'Bad credentials' ...}
cause The GitHub connection (e.g., `github_default`) uses a Personal Access Token that is invalid, expired, or revoked. A valid token that only lacks the necessary scopes (permissions) usually fails with a 403 or 404 response rather than 401, but it is worth checking scopes at the same time.
fix In the Airflow UI, navigate to Admin -> Connections, edit your GitHub connection, and ensure the 'GitHub Access Token' (password field) is a valid, active token with all required scopes (e.g., 'repo' scope for creating issues or interacting with repositories).
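When checking a classic Personal Access Token outside Airflow, the scopes it was granted can be compared against what a task needs. The sketch below is illustrative only: it assumes a classic token (GitHub reports those scopes in the `X-OAuth-Scopes` response header; fine-grained tokens do not use it), and the helper names `missing_scopes` and `fetch_granted_scopes` are hypothetical, not part of the provider. It also does a plain string comparison and does not know that a parent scope such as `repo` implies its child scopes.

```python
import urllib.request


def missing_scopes(granted_header, required):
    """Return the required scopes that are absent from an X-OAuth-Scopes value."""
    granted = {s.strip() for s in (granted_header or "").split(",") if s.strip()}
    return sorted(set(required) - granted)


def fetch_granted_scopes(token):
    """Ask the GitHub API which scopes a classic token carries.

    Raises urllib.error.HTTPError (401) when the token itself is rejected.
    """
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("X-OAuth-Scopes", "")


# Example (needs network access and a real token, so it is left commented out):
# print(missing_scopes(fetch_granted_scopes("<your-token>"), ["repo"]))
```

An empty list from `missing_scopes` means every required scope name appears in the header; anything else names the scopes to add when regenerating the token.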
error Failed to execute GithubOperator, error: 404 {'message': 'Not Found' ...}
cause The specified GitHub repository, user, or resource (e.g., a specific issue number or tag) does not exist or is inaccessible to the authenticated GitHub user/app.
fix Verify that the `repository_name`, `assignees`, `tag_name`, or other resource-specific parameters in your operator or sensor are correct and accessible by the GitHub account associated with the Airflow connection.
Warnings
breaking Provider versions frequently update their minimum required Apache Airflow version. Installing a new provider version with an older Airflow core can lead to incompatibility errors or unexpected behavior.
fix Always check the `apache-airflow-providers-github` documentation for the `Requirements` section to verify the compatible Airflow version. Upgrade `apache-airflow` to the specified minimum version or install an older, compatible provider version.
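The installed versions can be compared against a documented minimum before upgrading. This is a minimal sketch using only the standard library; `MIN_AIRFLOW` is a placeholder value, not the provider's actual requirement, and should be replaced with the figure from the `Requirements` section. The comparison ignores pre-release suffixes such as `rc1`; `packaging.version` is a more robust choice when it is available.

```python
from importlib import metadata

# Placeholder only: take the real minimum from the provider's Requirements section.
MIN_AIRFLOW = "2.9.0"


def version_tuple(version):
    """Turn '2.11.2' into (2, 11, 2); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True when the installed version is at least the minimum version."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '2.10' compares equal to '2.10.0'
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    airflow_version = installed_version("apache-airflow")
    provider_version = installed_version("apache-airflow-providers-github")
    print("apache-airflow:", airflow_version)
    print("apache-airflow-providers-github:", provider_version)
    if airflow_version and not meets_minimum(airflow_version, MIN_AIRFLOW):
        print(f"Airflow is older than the assumed minimum {MIN_AIRFLOW}")
```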
gotcha GitHub connections are often misconfigured, especially the authentication settings. Typical mistakes are a Personal Access Token (PAT) with insufficient permissions and incorrectly entered GitHub App authentication details.
fix Ensure the GitHub connection in Airflow (Admin -> Connections) is of `Connection Type: GitHub`. For PAT-based authentication, the 'GitHub Access Token' (password field) must have the necessary scopes (e.g., 'repo' for issue management). For GitHub App authentication, correctly populate the 'Extras' field with `key_path`, `app_id`, and `installation_id`.
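As an alternative to the UI, Airflow can read a connection from an `AIRFLOW_CONN_<CONN_ID>` environment variable in JSON form. The sketch below builds such a value for both authentication styles. The helper name `github_connection_json` is hypothetical, every token, path, and ID is a placeholder, and the exact value types the hook expects for the GitHub App extras are an assumption here; confirm them against the provider documentation.

```python
import json


def github_connection_json(token=None, app_extras=None):
    """Build a JSON connection definition with conn_type 'github'."""
    conn = {"conn_type": "github"}
    if token:
        conn["password"] = token  # the 'GitHub Access Token' field
    if app_extras:
        conn["extra"] = app_extras  # the 'Extras' field
    return json.dumps(conn)


# PAT-based authentication (placeholder token).
pat_conn = github_connection_json(token="<personal-access-token>")

# GitHub App authentication, using the extras named above (placeholder values).
app_conn = github_connection_json(
    app_extras={
        "key_path": "/path/to/private-key.pem",
        "app_id": "<app-id>",
        "installation_id": "<installation-id>",
    }
)

if __name__ == "__main__":
    # Export one of these so Airflow resolves conn_id 'github_default'.
    print(f"AIRFLOW_CONN_GITHUB_DEFAULT='{pat_conn}'")
```

Keeping the token in an environment variable or a secrets backend also avoids storing it in the metadata database.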
gotcha `PyGithub` exceptions raised while calling the GitHub API are wrapped in `AirflowException`. They usually point to a problem with the API call itself, such as an invalid repository name, a non-existent user for assignment, or insufficient permissions on the GitHub side.
fix Review the Airflow task logs for the underlying `GithubException` message. Verify the parameters passed to the operator (e.g., `repository_name`, `assignees`, `labels`) are valid within your GitHub context and that the configured GitHub connection has the appropriate permissions for the action being performed.
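When triaging many failed tasks, the HTTP status can be pulled out of the wrapped message and mapped to the causes listed under Common errors. This is only a sketch: it assumes the message format shown in the examples above (`Failed to execute GithubOperator, error: <status> {...}`), and the helper names are hypothetical.

```python
import re

# Hints taken from the Common errors entries above.
HINTS = {
    401: "token invalid or expired: check the connection's access token",
    404: "resource missing or not visible to the authenticated account",
}

_STATUS = re.compile(r"error: (\d{3})\b")


def status_from_message(message):
    """Extract the HTTP status code from a wrapped GithubOperator error."""
    match = _STATUS.search(message)
    return int(match.group(1)) if match else None


def hint_for(message):
    """Map an error message to a short hint, with a generic fallback."""
    return HINTS.get(
        status_from_message(message),
        "see the task log for the underlying GithubException",
    )


if __name__ == "__main__":
    sample = "Failed to execute GithubOperator, error: 401 {'message': 'Bad credentials'}"
    print(hint_for(sample))
```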
Imports
- GithubOperator
  from airflow.providers.github.operators.github import GithubOperator
- GithubSensor
  wrong: from airflow.providers.github.github.sensors.github import GithubSensor
  correct: from airflow.providers.github.sensors.github import GithubSensor
- GithubHook
  from airflow.providers.github.hooks.github import GithubHook
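A quick way to confirm that these import paths resolve in a given environment is to look the modules up without running any DAG code. This is a minimal sketch using the standard library; the helper name `missing_modules` is hypothetical. Note that `importlib.util.find_spec` imports parent packages, so with Airflow installed this does import `airflow` itself.

```python
import importlib.util

PROVIDER_MODULES = [
    "airflow.providers.github.operators.github",
    "airflow.providers.github.sensors.github",
    "airflow.providers.github.hooks.github",
]


def missing_modules(module_names):
    """Return the names from module_names that cannot be located."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            # A parent package (for example `airflow`) is not installed.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(PROVIDER_MODULES):
        print(f"cannot locate {name}")
```

Any name it reports corresponds to the `ModuleNotFoundError` described under Common errors.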
Quickstart
import os
from datetime import datetime

from airflow.models.dag import DAG
from airflow.providers.github.operators.github import GithubOperator

# Full repository name in "owner/repo" form, built from environment variables.
REPO_FULL_NAME = (
    f"{os.environ.get('GITHUB_REPO_OWNER', 'test-user')}/"
    f"{os.environ.get('GITHUB_REPO_NAME', 'test-repo')}"
)

with DAG(
    dag_id='github_create_issue_example',
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
    tags=['github', 'example', 'issue'],
    doc_md="""### GitHub Create Issue Example
This DAG demonstrates creating a GitHub issue using the GithubOperator.
Ensure you have a GitHub connection configured with `conn_id='github_default'`
and a Personal Access Token with 'repo' scope.
Environment variables:
- GITHUB_REPO_OWNER: Owner of the repository (e.g., 'apache')
- GITHUB_REPO_NAME: Name of the repository (e.g., 'airflow')
""",
) as dag:
    # GithubOperator calls `github_method` on the top-level PyGithub client, so
    # the repository is fetched first and the issue is created in `result_processor`.
    create_github_issue = GithubOperator(
        task_id='create_github_issue',
        github_conn_id='github_default',
        github_method='get_repo',
        github_method_args={'full_name_or_id': REPO_FULL_NAME},
        result_processor=lambda repo: repo.create_issue(
            title='Airflow created issue from DAG',
            body='This issue was automatically created by an Airflow DAG.',
            # Optional: add assignees=['<github-username>'] with a valid user for your repo
            labels=['bug', 'airflow-automation'],
            # Optional: pass other arguments supported by PyGithub's create_issue
            # method, e.g., milestone
        ).html_url,  # return the issue URL (a plain string) as the task result
    )