Apache Airflow IMAP Provider

3.11.1 · active · verified Thu Apr 09

The Apache Airflow IMAP Provider enables Airflow to interact with IMAP email servers. It provides a hook (`ImapHook`) and a sensor (`ImapAttachmentSensor`) for tasks such as detecting and retrieving email attachments. The current version is 3.11.1. It follows the release cadence of the Apache Airflow providers, which are released regularly to support new Airflow features and fix bugs.

Warnings

Install

Imports

Quickstart

This example demonstrates how to use `ImapHook` inside a task to fetch a PDF attachment from an IMAP email server. It assumes an Airflow connection named 'imap_default' is configured with the necessary IMAP server credentials. The hook searches unseen emails from a given sender with a given subject, matches attachment names against a regex pattern, and saves the matching files to a specified directory.

import os

import pendulum
from airflow.decorators import task
from airflow.models.dag import DAG
from airflow.providers.imap.hooks.imap import ImapHook

# IMPORTANT: In a real Airflow deployment, you must configure an IMAP connection
# via the Airflow UI (Admin -> Connections) with a 'Conn Id', e.g., 'imap_default'.
# This connection should include Host, Port, Login (Username), Password, and
# optionally Extra parameters (e.g., {"use_ssl": true}) to control SSL/TLS.

# For this example, we assume 'imap_default' is configured or will be.
# The connection id and target directory are read from environment variables so
# they can be overridden per deployment; credentials themselves belong in the
# Airflow connection or a secrets backend.
IMAP_CONN_ID = os.environ.get("AIRFLOW_IMAP_CONN_ID", "imap_default")
TARGET_ATTACHMENT_DIRECTORY = os.environ.get("AIRFLOW_IMAP_TARGET_DIR", "/tmp/airflow_imap_attachments")

with DAG(
    dag_id="example_imap_retrieve_attachment",
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
    tags=["imap", "email", "provider"],
) as dag:

    # Task to retrieve a specific PDF attachment from an IMAP server
    @task(task_id="retrieve_monthly_report_attachment")
    def retrieve_monthly_report():
        # Create the download directory if it does not exist yet.
        os.makedirs(TARGET_ATTACHMENT_DIRECTORY, exist_ok=True)
        # Used as a context manager, the hook opens the IMAP session and closes it on exit.
        with ImapHook(imap_conn_id=IMAP_CONN_ID) as imap_hook:
            imap_hook.download_mail_attachments(
                name=r"report_.*\.pdf",  # Regex matching files like 'report_2023_01.pdf'
                local_output_directory=TARGET_ATTACHMENT_DIRECTORY,
                check_regex=True,  # Treat `name` as a regular expression
                mail_folder="INBOX",
                # IMAP search criteria: only unseen emails from this sender and subject
                mail_filter='UNSEEN FROM "reports@company.com" SUBJECT "Monthly Report"',
            )

    retrieve_monthly_report()
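
The attachment-name pattern in the quickstart is a regular expression, so it can help to check which file names it accepts before pointing it at a real mailbox. The following is a minimal standalone sketch using only Python's `re` module. It assumes the pattern is applied with `re.match` semantics (anchored at the start of the name only), which is an assumption about the provider's behavior rather than something stated on this page. The helper `matching_attachments` and the sample file names are hypothetical.

```python
import re

# Pattern taken from the quickstart; the file names below are made up for illustration.
ATTACHMENT_PATTERN = r"report_.*\.pdf"


def matching_attachments(pattern: str, file_names: list[str]) -> list[str]:
    """Return the names whose beginning matches `pattern` (re.match semantics)."""
    compiled = re.compile(pattern)
    return [name for name in file_names if compiled.match(name)]


candidates = [
    "report_2023_01.pdf",
    "report_2023_01.pdf.bak",   # matches too: re.match does not anchor the end
    "summary_report_2023.pdf",  # no match: the name does not start with 'report_'
    "report.pdf",               # no match: the underscore is required
    "REPORT_2023_02.pdf",       # no match: the pattern is case-sensitive
]

print(matching_attachments(ATTACHMENT_PATTERN, candidates))
# → ['report_2023_01.pdf', 'report_2023_01.pdf.bak']
```

Under that assumption, appending `$` to the pattern (`report_.*\.pdf$`) rejects names with trailing suffixes such as `.bak`.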
