Apache Airflow FTP Provider
The `apache-airflow-providers-ftp` package provides operators and hooks to interact with FTP and FTPS servers within Apache Airflow DAGs. It enables tasks like uploading, downloading, and deleting files from remote FTP/FTPS locations. The current version is 3.14.2, and provider packages release independently from core Airflow, typically for bug fixes, new features, or compatibility with new Airflow versions.
Warnings
- gotcha Incorrect Airflow connection configuration for FTP/FTPS. Users sometimes pick a generic 'HTTP' or 'SFTP' connection type instead of setting 'Conn Type' to 'FTP' for FTP/FTPS operations.
- gotcha Distinction between FTP and FTPS, and explicit vs. implicit FTPS. The `FTPFileTransmitOperator` and `FTPHook` default to plain FTP. For FTPS, you must use `FTPSFileTransmitOperator` or `FTPSHook` and configure the connection accordingly.
- gotcha File transfer mode (binary vs. ASCII). Issues can arise when transferring text files in binary mode, or vice versa, especially across different operating systems with varying line endings.
- gotcha Large file transfers may encounter network timeouts, leading to task failures. The default FTP connection timeout might be too short for very large files or slow networks.
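On the timeout point: the provider's hook is built on the standard library's `ftplib`, whose `FTP` constructor accepts a `timeout` in seconds. A minimal sketch of setting a generous timeout for large transfers (host and credentials are placeholders):

```python
import ftplib

# A longer timeout (in seconds) helps large files on slow links survive
# transient stalls; with no timeout argument, the global socket default applies.
ftp = ftplib.FTP(timeout=300)  # no connection is made until connect() is called
# ftp.connect("ftp.example.com", 21)        # placeholder host
# ftp.login("your_username", "your_password")
print(ftp.timeout)  # 300
```
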
Install
- pip install apache-airflow-providers-ftp
Imports
- FTPFileTransmitOperator
from airflow.providers.ftp.operators.ftp import FTPFileTransmitOperator
- FTPHook
from airflow.providers.ftp.hooks.ftp import FTPHook
- FTPSFileTransmitOperator
from airflow.providers.ftp.operators.ftp import FTPSFileTransmitOperator
- FTPSHook
from airflow.providers.ftp.hooks.ftp import FTPSHook
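Besides the UI, an Airflow connection can be supplied as an environment variable named `AIRFLOW_CONN_<CONN_ID>` holding a connection URI; a sketch with placeholder host and credentials:

```shell
# Defines the 'ftp_default' connection as a URI; the scheme sets the
# connection type, and host/login/password here are placeholders.
export AIRFLOW_CONN_FTP_DEFAULT='ftp://your_username:your_password@ftp.example.com:21'
```
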
Quickstart
from __future__ import annotations
import pendulum
from airflow.decorators import dag
from airflow.operators.bash import BashOperator
from airflow.providers.ftp.operators.ftp import FTPFileTransmitOperator

# Ensure you have an FTP connection configured in the Airflow UI (Admin -> Connections)
# with conn_id='ftp_default'.
# Set: Conn Id = ftp_default, Conn Type = FTP
# Host: your_ftp_host (e.g., 'localhost' or 'ftp.example.com')
# Port: 21 (standard FTP and explicit FTPS; implicit FTPS uses 990)
# Login: your_username
# Password: your_password

@dag(
    dag_id="ftp_example_dag",
    start_date=pendulum.datetime(2023, 10, 26, tz="UTC"),
    catchup=False,
    schedule=None,
    tags=["ftp", "example", "provider"],
)
def ftp_dag():
    # Create a dummy local file to upload
    create_local_file = BashOperator(
        task_id="create_local_file",
        bash_command="echo 'Hello from Airflow FTP provider!' > /tmp/airflow_ftp_test.txt",
    )

    # Upload the file to FTP
    upload_file_to_ftp = FTPFileTransmitOperator(
        task_id="upload_file_to_ftp",
        ftp_conn_id="ftp_default",
        local_filepath="/tmp/airflow_ftp_test.txt",
        remote_filepath="/remote_airflow_test.txt",
        operation="put",  # 'put' (upload) or 'get' (download)
        create_intermediate_dirs=True,
    )

    # Clean up the local dummy file
    clean_local_file = BashOperator(
        task_id="clean_local_file",
        bash_command="rm /tmp/airflow_ftp_test.txt",
    )

    create_local_file >> upload_file_to_ftp >> clean_local_file

ftp_dag()
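The binary-vs-ASCII gotcha noted in the warnings often shows up as stray carriage returns after transferring Windows-origin text files. A provider-independent sketch of normalizing line endings yourself after a binary-mode download, rather than relying on ASCII mode (the byte content is illustrative):

```python
# A text file created on Windows uses CRLF line endings; a binary-mode
# transfer preserves those bytes verbatim.
raw = b"first line\r\nsecond line\r\n"  # bytes as received in binary mode

# Normalize CRLF -> LF explicitly after download; ASCII-mode conversion
# is handled inconsistently by some servers.
normalized = raw.replace(b"\r\n", b"\n")

print(normalized)  # b'first line\nsecond line\n'
```
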