Apache Airflow Samba Provider
The `apache-airflow-providers-samba` package provides Apache Airflow hooks and operators for working with Samba (SMB/CIFS) file shares. It lets DAG tasks read, write, move, and delete files on Samba servers. The current version is 4.12.5. Like other providers in the Apache Airflow ecosystem, it is released on its own schedule and versioned independently of Airflow core, but each release declares a minimum supported Airflow version.
Warnings
- breaking: Each provider release has a minimum Apache Airflow core version. For example, `apache-airflow-providers-samba` version 4.12.x requires Airflow 2.11+ due to internal API changes like the removal of the `apply_default` decorator.
- gotcha: On Windows Samba shares, forward slashes (/) in paths can cause `STATUS_INVALID_PARAMETER` errors. The underlying `smbclient` library generally expects backslashes (\) for such shares, so convert paths or configure the share type accordingly.
- deprecated: The old import path `airflow.hooks.samba_hook` is deprecated; import `SambaHook` from `airflow.providers.samba.hooks.samba` instead.
- gotcha: Some users have reported Kerberos authentication failures (`SpnegoError`) with `SambaHook`, even when direct `smbclient` commands succeed outside Airflow.
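The forward-slash gotcha above can be sidestepped by normalising share-relative paths before they reach the hook. A minimal sketch using only the standard library: `to_windows_share_path` is a hypothetical helper, not part of the provider, and whether a given server actually needs backslashes is an assumption to verify against your own share.

```python
from pathlib import PureWindowsPath


def to_windows_share_path(path: str) -> str:
    """Convert a share-relative path to backslash-separated form.

    Hypothetical helper for illustration; not part of the Samba provider.
    """
    # Strip leading separators so the result stays relative to the share root.
    relative = path.lstrip("/\\")
    # PureWindowsPath accepts both separator styles and renders backslashes.
    return str(PureWindowsPath(relative))


print(to_windows_share_path("/reports/2023/summary.csv"))
```

Because `PureWindowsPath` is a pure path class, this runs on any operating system and never touches the filesystem.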
Install
pip install apache-airflow-providers-samba
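After installing, it can be worth confirming that the environment meets the minimum Airflow version mentioned in the warnings. The sketch below uses only the standard library. The helper names are hypothetical, the `2.11.0` floor is taken from the warning above and should be checked against the provider's changelog, and the version parsing is deliberately simplistic (it looks only at leading numeric components).

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple[int, ...]:
    """Parse leading numeric components, e.g. '2.11.0rc1' -> (2, 11, 0)."""
    parts: list[int] = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.11' and '2.11.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str) -> str | None:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


airflow_version = installed_version("apache-airflow")
if airflow_version is None:
    print("apache-airflow is not installed in this environment")
else:
    print(airflow_version, "ok" if meets_minimum(airflow_version, "2.11.0") else "too old")
```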
Imports
- SambaHook
from airflow.providers.samba.hooks.samba import SambaHook
- GCSToSambaOperator
from airflow.providers.samba.transfers.gcs_to_samba import GCSToSambaOperator
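For orientation, this is roughly how the hook import is used for a single file move. It is a sketch, assuming `SambaHook` works as a context manager and exposes a `rename(src, dst)` method, as described in the provider's hook documentation. `move_on_share`, `move_with_samba_hook`, and `SupportsRename` are hypothetical names introduced here. The Airflow import is kept inside the function so the helper can be exercised without Airflow installed.

```python
from __future__ import annotations

from typing import Protocol


class SupportsRename(Protocol):
    """Anything with a rename(src, dst) method, e.g. a connected SambaHook."""

    def rename(self, src: str, dst: str) -> None: ...


def move_on_share(hook: SupportsRename, src: str, dst: str) -> str:
    """Move `src` to `dst` on the share and return the destination path."""
    if src == dst:
        raise ValueError("source and destination are identical")
    hook.rename(src, dst)
    return dst


def move_with_samba_hook(src: str, dst: str, conn_id: str = "samba_default") -> str:
    """Open a SambaHook for `conn_id` and move one file (needs Airflow at call time)."""
    # Imported lazily: requires apache-airflow-providers-samba to be installed.
    from airflow.providers.samba.hooks.samba import SambaHook

    with SambaHook(samba_conn_id=conn_id) as hook:
        return move_on_share(hook, src, dst)
```

Keeping the move logic separate from the hook construction makes it straightforward to unit test with a stand-in object that records `rename` calls.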
Quickstart
from __future__ import annotations

import pendulum

from airflow.decorators import task
from airflow.models.dag import DAG
from airflow.providers.samba.hooks.samba import SambaHook

with DAG(
    dag_id="samba_file_operations_example",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    catchup=False,
    schedule=None,
    tags=["samba", "file_transfer"],
) as dag:
    # This task assumes an Airflow Connection named 'samba_default' is configured.
    # Configure a connection in the Airflow UI:
    #   Conn Id: samba_default
    #   Conn Type: Samba
    #   Host: <Samba Server IP/Hostname>
    #   Login: <Samba Username>
    #   Password: <Samba Password>
    #   Schema: <Optional default share name, e.g., 'share'>
    #   Extra: {"share_type": "posix"} or {"share_type": "windows"}
    @task(task_id="move_samba_file")
    def move_samba_file() -> None:
        # The provider exposes file operations through SambaHook rather than a
        # generic operator; paths are resolved relative to the configured share.
        with SambaHook(samba_conn_id="samba_default") as hook:
            hook.rename("/source/path/file.txt", "/destination/path/file.txt")
            # Related hook methods include open_file(), listdir(), and remove().

    move_samba_file()
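The connection described in the Quickstart comments can also be supplied without the UI, through an `AIRFLOW_CONN_<CONN_ID>` environment variable holding a connection URI. The sketch below assumes the Samba connection type maps to the `samba://` scheme and that Extra fields are passed as query parameters. `build_samba_conn_uri` is a hypothetical helper, and the host and credentials are placeholders.

```python
from urllib.parse import quote, urlencode


def build_samba_conn_uri(
    host: str,
    login: str,
    password: str,
    share: str,
    share_type: str = "posix",
) -> str:
    """Build an Airflow-style connection URI for a Samba share.

    Hypothetical helper for illustration; not part of Airflow or the provider.
    """
    # Percent-encode credentials so characters like '@' or '/' do not break the URI.
    userinfo = f"{quote(login, safe='')}:{quote(password, safe='')}"
    # Extra fields such as share_type travel as query parameters.
    query = urlencode({"share_type": share_type})
    return f"samba://{userinfo}@{host}/{quote(share, safe='')}?{query}"


uri = build_samba_conn_uri(
    host="fileserver.example.com",
    login="svc_airflow",
    password="p@ss/word",
    share="share",
    share_type="windows",
)
print(f"export AIRFLOW_CONN_SAMBA_DEFAULT='{uri}'")
```

In practice the password would come from a secrets backend rather than being written into a shell profile; the environment variable form is mainly convenient for local testing.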