Google Cloud BigQuery Data Transfer
The Google Cloud BigQuery Data Transfer API client library, currently at version 3.21.0, lets you programmatically manage scheduled data transfers from partner SaaS applications and other Google services into Google BigQuery, automating ETL processes and data replication. The library lives in the `google-cloud-python` repository, which follows a regular release cadence with frequent updates across its client libraries.
Warnings
- breaking Future versions of Google Cloud client libraries, including `google-cloud-bigquery-datatransfer`, will drop support for Python 3.7 and 3.8. Python 3.7 reached end-of-life (EOL) in June 2023, and 3.8 in October 2024. New major versions of the library will be incompatible with these EOL Python runtimes.
- gotcha BigQuery Data Transfer Service's data source connectors (e.g., Google Ads, Display & Video 360) periodically undergo API version upgrades. These upgrades can introduce schema changes, such as new, modified, or deprecated columns, which may affect existing transfer configurations and downstream data processing pipelines.
- gotcha Authentication requires appropriate IAM roles. For service accounts creating transfers, 'BigQuery Admin' and 'BigQuery Data Transfer Service Agent' roles are often necessary, in addition to permissions to access the source data. Incorrect permissions are a common cause of transfer failures.
- gotcha Effective February 1, 2026, BigQuery requests that read data from multi-region Cloud Storage buckets will incur Cloud Storage multi-region data transfer fees. This can significantly impact billing for transfers involving data stored across different regions.
- deprecated BigQuery will limit the use of Legacy SQL starting June 1, 2026. If a project has not used Legacy SQL between November 1, 2025, and June 1, 2026, it will no longer be able to use it. Existing workloads might continue, but new ones may fail.
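Since missing IAM bindings are a common cause of transfer failures, the roles mentioned above can be granted with `gcloud`. This is a sketch: `PROJECT_ID` and the service account name are placeholders, and the commands are printed first as a dry run so you can review them before applying.

```shell
# Placeholders — substitute your own project and service account.
PROJECT_ID="your-project-id"
SA="transfer-sa@${PROJECT_ID}.iam.gserviceaccount.com"

# Roles commonly needed to create and run transfers (see warning above).
# Print the commands first (dry run); remove 'echo' to actually apply them.
for ROLE in roles/bigquery.admin roles/bigquerydatatransfer.serviceAgent; do
  echo gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:${SA}" \
    --role="$ROLE"
done
```

The service account also needs read access to the source data itself (for example, `roles/storage.objectViewer` for Cloud Storage transfers); the roles above only cover the transfer service side.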
Install
pip install google-cloud-bigquery-datatransfer
Imports
- DataTransferServiceClient
from google.cloud import bigquery_datatransfer_v1

client = bigquery_datatransfer_v1.DataTransferServiceClient()
Quickstart
import os
from google.cloud import bigquery_datatransfer_v1
# Set your Google Cloud Project ID and Location
# For local development, set the GOOGLE_APPLICATION_CREDENTIALS environment variable.
# Example: export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/key.json"
project_id = os.environ.get("GOOGLE_CLOUD_PROJECT", "your-project-id")
location = "us"
client = bigquery_datatransfer_v1.DataTransferServiceClient()
parent = client.common_location_path(project_id, location)

try:
    print(f"Listing data sources in project '{project_id}' and location '{location}':")
    for data_source in client.list_data_sources(parent=parent):
        print(f"  Name: {data_source.display_name} (ID: {data_source.data_source_id})")
except Exception as e:
    print(f"Error listing data sources: {e}")
    print(
        f"Ensure the BigQuery Data Transfer API is enabled for project "
        f"'{project_id}' and your credentials have sufficient permissions."
    )
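Beyond listing data sources, the same client can create transfer configurations. The sketch below schedules a query as a transfer config (a "scheduled query") via `create_transfer_config`. The dataset, table, query, and schedule values are illustrative placeholders; the import is deferred into the function so the configuration can be inspected without the client library installed, and calling the function requires credentials and the API enabled.

```python
# Illustrative configuration for a scheduled query. All names here
# (dataset, display name, table template) are placeholders.
transfer_config_fields = {
    "destination_dataset_id": "my_dataset",
    "display_name": "nightly-word-counts",
    "data_source_id": "scheduled_query",  # built-in scheduled-query source
    "schedule": "every 24 hours",         # DTS schedule syntax
    "params": {
        "query": (
            "SELECT word, SUM(word_count) AS n "
            "FROM `bigquery-public-data.samples.shakespeare` "
            "GROUP BY word"
        ),
        "destination_table_name_template": "word_counts",
        "write_disposition": "WRITE_TRUNCATE",
    },
}


def create_scheduled_query(project_id: str, location: str = "us"):
    """Create the transfer config above. Requires valid credentials."""
    # Imported here so the config dict can be used without the library.
    from google.cloud import bigquery_datatransfer_v1

    client = bigquery_datatransfer_v1.DataTransferServiceClient()
    parent = client.common_location_path(project_id, location)
    transfer_config = bigquery_datatransfer_v1.TransferConfig(
        **transfer_config_fields
    )
    return client.create_transfer_config(
        parent=parent, transfer_config=transfer_config
    )
```

The returned `TransferConfig` includes the server-assigned resource name, which you can pass to `get_transfer_config` or `delete_transfer_config` later.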