Apache Airflow SFTP Provider

5.7.2 · active · verified Thu Apr 09

This provider package lets Apache Airflow interact with SFTP servers, enabling secure file transfer and remote file management from within Airflow DAGs. It includes operators and hooks for common SFTP operations such as getting, putting, and listing files. The current version is 5.7.2; new versions are released regularly as part of the Apache Airflow provider release cycle.

Warnings

Install
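
The provider is published on PyPI as `apache-airflow-providers-sftp`. A minimal installation, pinned to the version shown above:

pip install apache-airflow-providers-sftp==5.7.2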

Imports
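
The operators and hooks used below live under the `airflow.providers.sftp` namespace:

from airflow.providers.sftp.hooks.sftp import SFTPHook
from airflow.providers.sftp.operators.sftp import SFTPOperation, SFTPOperator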

Quickstart

This quickstart uses the `SFTPOperator` to upload a file from a local path to an SFTP server and then download it back. It assumes an SFTP connection named `sftp_default` is configured in Airflow. For demonstration purposes, the connection can also be supplied as a URI through the `AIRFLOW_CONN_SFTP_DEFAULT` environment variable, though connections are typically managed in the Airflow UI.

from datetime import datetime
from pathlib import Path

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.sftp.operators.sftp import SFTPOperation, SFTPOperator

# The Airflow connection ID used by both tasks. Host, port, and credentials
# live in the connection itself, not in operator arguments.
SFTP_CONN_ID = 'sftp_default'

# Configure the connection in the Airflow UI, or supply it via the
# AIRFLOW_CONN_<CONN_ID> environment variable convention, e.g.:
# export AIRFLOW_CONN_SFTP_DEFAULT='sftp://sftpuser:sftppassword@localhost:22/'

with DAG(
    dag_id='sftp_example_dag',
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
    tags=['sftp', 'example', 'file-transfer'],
) as dag:
    upload_file_task = SFTPOperator(
        task_id='upload_local_to_sftp',
        ssh_conn_id=SFTP_CONN_ID,
        local_filepath='/tmp/local_file_to_upload.txt',
        remote_filepath='/tmp/remote_uploaded_file.txt',
        operation=SFTPOperation.PUT,
        create_intermediate_dirs=True,
        # Host, port, and credentials come from SFTP_CONN_ID; remote_host is
        # the only connection-detail override the operator accepts.
    )

    download_file_task = SFTPOperator(
        task_id='download_sftp_to_local',
        ssh_conn_id=SFTP_CONN_ID,
        local_filepath='/tmp/local_downloaded_file.txt',
        remote_filepath='/tmp/remote_uploaded_file.txt',
        operation=SFTPOperation.GET,
        # Connection details again come from SFTP_CONN_ID.
    )

    # The upload task expects /tmp/local_file_to_upload.txt to exist, so
    # create it first with a small PythonOperator.
    create_local_file = PythonOperator(
        task_id='create_local_file',
        python_callable=lambda: Path('/tmp/local_file_to_upload.txt').write_text(
            'Hello from Airflow!'
        ),
    )

    create_local_file >> upload_file_task >> download_file_task
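
The quickstart covers the operator's put and get paths. For listing files or other ad-hoc operations mentioned above, the provider's `SFTPHook` can be used directly. A minimal sketch, assuming the same `sftp_default` connection and the remote path written by the upload task (the local destination path is illustrative):

from airflow.providers.sftp.hooks.sftp import SFTPHook

hook = SFTPHook(ssh_conn_id='sftp_default')

# List the contents of a remote directory.
for name in hook.list_directory('/tmp'):
    print(name)

# Download a file only if it actually exists on the server.
if hook.path_exists('/tmp/remote_uploaded_file.txt'):
    hook.retrieve_file('/tmp/remote_uploaded_file.txt', '/tmp/local_copy.txt')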
