Apache Airflow Salesforce Provider

5.14.0 · active · verified Sun Apr 12

The `apache-airflow-providers-salesforce` package is an Apache Airflow provider that enables seamless interaction with the Salesforce platform. It offers operators, hooks, and sensors to perform various operations such as data extraction, data loading, and executing Salesforce API calls directly from Airflow Directed Acyclic Graphs (DAGs). The current version is 5.14.0, and the package maintains an active development cycle with frequent monthly or bi-monthly releases to introduce new features, improvements, and bug fixes.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates a basic Airflow DAG using the `SalesforceToTableOperator` to extract data from Salesforce and shows how `SalesforceHook` can be used directly for more custom interactions. Before running, configure a Salesforce connection in the Airflow UI with `Conn Id: salesforce_default` and provide necessary credentials (username, password, security token, login URL).

import os
from datetime import datetime
from airflow.models.dag import DAG
from airflow.providers.salesforce.hooks.salesforce import SalesforceHook
from airflow.providers.salesforce.operators.salesforce import SalesforceToTableOperator

with DAG(
    dag_id='salesforce_example_dag',
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    catchup=False,
    tags=['salesforce', 'example'],
) as dag:
    # Example of using SalesforceToTableOperator to extract data
    extract_contacts = SalesforceToTableOperator(
        task_id='extract_salesforce_contacts',
        salesforce_conn_id='salesforce_default', # Ensure this connection is configured in Airflow UI
        sobjects=['Contact'],
        table_name='airflow_contacts',
        selected_fields=['Id', 'FirstName', 'LastName', 'Email'],
        where_clause="MailingState = 'CA'",
        # Optional: You might push to a database table or another destination
        # E.g., postgres_conn_id='postgres_default'
        # For this example, we'll just demonstrate extraction.
    )

    # Example of using SalesforceHook directly (e.g., in a PythonOperator)
    def get_salesforce_data(**kwargs):
        hook = SalesforceHook(salesforce_conn_id='salesforce_default')
        sf_client = hook.get_conn()
        # Query the Salesforce API directly
        records = sf_client.query("SELECT Id, Name FROM Account LIMIT 5")
        print(f"Fetched {len(records['records'])} Account records.")
        return records['records']

    # You'd typically use a PythonOperator to wrap the hook usage
    # from airflow.operators.python import PythonOperator
    # fetch_data_task = PythonOperator(
    #     task_id='fetch_data_with_hook',
    #     python_callable=get_salesforce_data,
    # )
    # extract_contacts >> fetch_data_task

view raw JSON →