Google Cloud BigQuery BigLake API Client Library
The `google-cloud-bigquery-biglake` library is the Python client for the Google Cloud BigLake API. BigLake is a unified storage engine that simplifies data access across data warehouses and data lakes. It offers uniform, fine-grained access control across multi-cloud storage such as Google Cloud Storage, Amazon S3, and Azure Data Lake Storage, and supports querying open-source table formats such as Apache Iceberg in BigQuery. The library is developed in the `google-cloud-python` monorepo, so its release and maintenance cadence follows that of the other Google Cloud client libraries.
Common errors
- ModuleNotFoundError: No module named 'google.cloud.bigquery_biglake'
  cause: The `google-cloud-bigquery-biglake` library is not installed, or the Python environment running the code cannot see the installed package.
  fix: Install the library with pip (`pip install google-cloud-bigquery-biglake`), or make sure the virtual environment it was installed into is activated.
- Access Denied: Connection <connection_id>: User does not have bigquery.connections.delegate permission for connection <connection_id>
  cause: The user or service account creating or interacting with a BigLake table lacks the `bigquery.connections.delegate` IAM permission on the specified connection resource.
  fix: Grant the `roles/bigquery.connectionAdmin` role (or a custom role that includes `bigquery.connections.delegate`) to the user or service account that uses the BigLake connection. Make sure the role is granted to the correct principal (e.g., a user group or the service account directly).
- An internal error occurred and the request could not be completed.
  cause: This is a general BigQuery internal error, often transient, indicating that the service hit an unexpected issue during job execution.
  fix: Retry the operation with exponential back-off. If the error persists after retries, the problem is probably not transient; contact Google Cloud Support with the job ID and the detailed error messages.
- Error while creating a BigLake table: BigLake managed tables are not supported.
  cause: The operation targets BigLake managed tables in a region or configuration where they are not supported, or the syntax used belongs to an unsupported feature.
  fix: Check the documentation for BigLake managed table support in your region and for your use case. If managed tables are not available for your requirements, consider BigLake external tables or a different table creation strategy.
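The retry advice for transient internal errors can be sketched with the standard library alone. This is a minimal illustration, not part of the BigLake client: `retry_with_backoff` and all of its parameters are hypothetical names, and retrying on any `Exception` is an assumption that you would normally narrow to the transient error types your client actually raises.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 32.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying failed attempts with exponential back-off.

    Hypothetical helper for illustration; not an API of any Google library.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # The delay doubles on each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists only so the helper can be exercised without real waiting. Google Cloud client methods generally also accept a `retry` argument backed by `google.api_core.retry.Retry`, which may be a better fit than a hand-written loop when it covers your case.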
Warnings
- breaking The library is currently at version 0.7.0. While API stability is generally high for Google Cloud client libraries, pre-1.0.0 versions may introduce breaking changes in minor releases. Always review release notes when upgrading to new minor versions (e.g., 0.7.x to 0.8.x).
- gotcha The BigLake API must be explicitly enabled in your Google Cloud project before you can use this client library to interact with BigLake resources.
- gotcha Proper authentication is required. For local development, Application Default Credentials (ADC) are recommended, often set up using `gcloud auth application-default login` or by setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to a service account key file.
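- gotcha Log events from this library are not handled by default; you must configure log handling explicitly. Logs may contain sensitive information, so restrict where they are written and who can read them.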
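Because log output is not handled by default, one way to turn it on is to attach a handler with the standard `logging` module. The sketch below assumes the library logs under the `google` logger hierarchy, as other Google Cloud Python clients do; `enable_google_client_logging` is a hypothetical helper name, not a library function.

```python
import logging


def enable_google_client_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach a stream handler to the "google" logger and set its level.

    Hypothetical helper; assumes client log events are emitted under "google".
    """
    logger = logging.getLogger("google")
    # Avoid stacking duplicate handlers if this function is called twice.
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


if __name__ == "__main__":
    # DEBUG is the most verbose level and the most likely to expose
    # sensitive request details, so prefer it only for local troubleshooting.
    enable_google_client_logging(logging.DEBUG)
```

Keeping the default at `INFO` and opting into `DEBUG` only when needed limits how much potentially sensitive detail ends up in log files.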
Install
-
pip install google-cloud-bigquery-biglake
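To confirm that the install is visible to the interpreter you actually run, a quick check along these lines can help before debugging a `ModuleNotFoundError`. It is a small sketch using only the standard library; `is_module_available` is a hypothetical helper name.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if module_name can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # find_spec imports parent packages first; this is raised when a
        # parent (for example "google.cloud") is itself missing or broken.
        return False


if __name__ == "__main__":
    name = "google.cloud.bigquery_biglake_v1"
    print(f"{name} importable: {is_module_available(name)}")
```

If this reports that the module is not importable right after a successful `pip install`, the package was most likely installed into a different environment than the one running the script.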
Imports
- MetastoreServiceClient
from google.cloud.bigquery_biglake_v1 import MetastoreServiceClient
Quickstart
import os

from google.api_core.exceptions import GoogleAPICallError
from google.cloud.bigquery_biglake_v1 import MetastoreServiceClient

# Set your Google Cloud project ID and a location (e.g., 'us-central1').
project_id = os.environ.get('GOOGLE_CLOUD_PROJECT', 'your-gcp-project-id')
location = os.environ.get('GOOGLE_CLOUD_LOCATION', 'us-central1')

# Ensure you have authenticated to Google Cloud. For local development, use:
#   gcloud auth application-default login


def list_biglake_catalogs(project_id: str, location: str):
    """Lists BigLake catalogs in a given project and location."""
    # The BigLake API exposes catalogs through its metastore service client.
    client = MetastoreServiceClient()
    parent = f"projects/{project_id}/locations/{location}"
    print(f"Listing BigLake catalogs in {parent}:")
    try:
        # list_catalogs returns a pager that yields Catalog objects.
        for catalog in client.list_catalogs(parent=parent):
            print(f"  Catalog: {catalog.name}")
    except GoogleAPICallError as e:
        # Covers API-level failures such as PermissionDenied or NotFound.
        print(f"An error occurred: {e}")
        print("Ensure the BigLake API is enabled for your project and the service account has 'biglake.catalogs.list' permission.")


if __name__ == "__main__":
    list_biglake_catalogs(project_id, location)
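The quickstart builds the `parent` resource name with an f-string. When several calls need such names, a small validating helper can catch empty or malformed IDs before a request is sent. The sketch below is illustrative only: `build_parent` and `build_catalog_name` are hypothetical names, and the catalog pattern `projects/{project}/locations/{location}/catalogs/{catalog}` is assumed to extend the `parent` format shown above.

```python
def _check_segment(label: str, value: str) -> str:
    """Reject empty IDs and IDs containing '/' that would corrupt the path."""
    if not value or "/" in value:
        raise ValueError(f"Invalid {label}: {value!r}")
    return value


def build_parent(project_id: str, location: str) -> str:
    """Build the location-level parent name used when listing catalogs."""
    project_id = _check_segment("project_id", project_id)
    location = _check_segment("location", location)
    return f"projects/{project_id}/locations/{location}"


def build_catalog_name(project_id: str, location: str, catalog_id: str) -> str:
    """Build a full catalog resource name (assumed pattern, see lead-in)."""
    catalog_id = _check_segment("catalog_id", catalog_id)
    return f"{build_parent(project_id, location)}/catalogs/{catalog_id}"


if __name__ == "__main__":
    print(build_parent("my-project", "us-central1"))
    print(build_catalog_name("my-project", "us-central1", "my_catalog"))
```

Generated Google Cloud clients often ship their own path helper methods; if the client you use provides them, prefer those over hand-rolled string formatting.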