Google Cloud Storage Transfer API Client Library

1.20.0 · active · verified Sun Mar 29

The `google-cloud-storage-transfer` library is the official Python client for the Google Cloud Storage Transfer Service. It enables programmatic control over data transfers to and from Google Cloud Storage, supporting sources that include other cloud providers and on-premises systems. Currently at version 1.20.0, the library follows Google Cloud's regular release cadence and is developed in the `googleapis/google-cloud-python` monorepo alongside the other Python client libraries.

Warnings

Install
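The library is published on PyPI under the same name as the package; a typical install into a virtual environment:

```shell
pip install google-cloud-storage-transfer
```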

Imports
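The quickstart below needs only the service client. A sketch of the usual import forms (the package root also exposes the request/message types, assuming the library is installed):

```python
# Direct import of the client class, as used in the quickstart:
from google.cloud.storage_transfer import StorageTransferServiceClient

# Or import the package root to reach message and enum types as well:
from google.cloud import storage_transfer
```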

Quickstart

This quickstart demonstrates how to create and immediately run a one-time transfer job to move data from a Google Cloud Storage source bucket to a Google Cloud Storage sink bucket. It requires enabling the Storage Transfer Service API, setting up Application Default Credentials, and ensuring the service account has appropriate permissions to both buckets. Replace placeholder variables with your Google Cloud Project ID and GCS bucket names.

import os
from google.cloud.storage_transfer import StorageTransferServiceClient

def create_and_run_gcs_to_gcs_transfer_job(
    project_id: str,
    source_bucket_name: str,
    sink_bucket_name: str,
    job_description: str,
):
    """Creates and runs a one-time transfer job between two GCS buckets."""
    client = StorageTransferServiceClient()

    # Transfer job configuration
    transfer_job = {
        "project_id": project_id,
        "description": job_description,
        "transfer_spec": {
            "gcs_data_source": {"bucket_name": source_bucket_name},
            "gcs_data_sink": {"bucket_name": sink_bucket_name},
        },
        "status": "ENABLED",  # Job is created in an enabled state
    }

    try:
        # Create the transfer job. Because no schedule is set, this defines a
        # one-time job; ENABLED makes it eligible to run.
        created_job = client.create_transfer_job(transfer_job=transfer_job)
        print(f"Created transfer job: {created_job.name}")

        # Explicitly start the one-time job.
        client.run_transfer_job(
            {"job_name": created_job.name, "project_id": project_id}
        )
        print(f"Transfer job '{created_job.name}' initiated.")

    except Exception as e:
        print(f"Error creating or running transfer job: {e}")

# Example usage (replace with your actual project and bucket names)
if __name__ == "__main__":
    # Ensure these environment variables are set or replace with actual values
    # For local development, `gcloud auth application-default login` often provides credentials.
    PROJECT_ID = os.environ.get("GCP_PROJECT_ID", "your-gcp-project-id")
    SOURCE_BUCKET = os.environ.get("GCP_SOURCE_BUCKET", "your-source-gcs-bucket")
    SINK_BUCKET = os.environ.get("GCP_SINK_BUCKET", "your-sink-gcs-bucket")
    JOB_DESCRIPTION = "My Python quickstart GCS to GCS transfer"

    if PROJECT_ID == "your-gcp-project-id" or SOURCE_BUCKET == "your-source-gcs-bucket" or SINK_BUCKET == "your-sink-gcs-bucket":
        print("Please set GCP_PROJECT_ID, GCP_SOURCE_BUCKET, and GCP_SINK_BUCKET environment variables or replace placeholder values.")
    else:
        create_and_run_gcs_to_gcs_transfer_job(
            PROJECT_ID, SOURCE_BUCKET, SINK_BUCKET, JOB_DESCRIPTION
        )
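The placeholder check above can be factored into a small fail-fast helper so a misconfigured run raises before any job is created. A minimal sketch; `require_env` is our illustrative name, not part of the library:

```python
import os

def require_env(name: str, placeholder: str) -> str:
    """Return the environment variable's value, refusing placeholders.

    Raises ValueError when the variable is unset (or still holds the
    documentation placeholder), so the script fails before calling the API.
    """
    value = os.environ.get(name, placeholder)
    if value == placeholder:
        raise ValueError(
            f"Set the {name} environment variable (got placeholder {placeholder!r})"
        )
    return value
```

With this helper, the `__main__` block reduces to three `require_env` calls instead of the compound `if` condition.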
