Google Cloud Dataplex
v2.17.0 · verified Tue May 12 · auth: no · python · install: verified
Google Cloud Dataplex is a unified data governance platform that provides an intelligent data fabric to centrally manage, monitor, and govern data across data lakes, data warehouses, and data marts. It enables consistent controls, trusted data access, and powers analytics at scale. The Python client library is currently at version 2.17.0 and is actively maintained with frequent releases.
pip install google-cloud-dataplex

Common errors
error Permission denied ↓
cause The service account or user account performing the operation lacks the necessary IAM permissions for the requested Dataplex resource or action.
fix
Grant the appropriate IAM roles (e.g., roles/dataplex.admin, roles/dataplex.viewer, roles/dataplex.editor, granular permissions such as dataplex.metadataFeeds.create, or related roles such as roles/bigquery.dataOwner) to the service account or user on the relevant Google Cloud project, lake, zone, or resource via the Google Cloud Console or the gcloud CLI.
error API not enabled or Service unavailable ↓
cause The Google Cloud Dataplex API, or a related required API such as the Data Lineage API, has not been enabled in your Google Cloud project.
fix
Enable the required API(s) through the Google Cloud Console by navigating to 'APIs & Services' > 'Library', searching for the specific API (e.g., 'Dataplex API'), and clicking 'Enable'. Allow a few minutes for the API to fully activate before retrying.
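As an alternative to the Console flow, the required APIs can also be enabled from the command line. This sketch only assembles the `gcloud services enable` invocation as a string; the project ID is a placeholder, and the command must be run in a shell by an account with sufficient permissions.

```python
# Sketch: build the gcloud command that enables the Dataplex API (and,
# optionally, related APIs such as the Data Lineage API) on a project.
def enable_api_command(project_id: str, *services: str) -> str:
    """Return a gcloud invocation enabling the given services on project_id."""
    svc = " ".join(services) if services else "dataplex.googleapis.com"
    return f"gcloud services enable {svc} --project={project_id}"

print(enable_api_command("my-project"))
print(enable_api_command("my-project",
                         "dataplex.googleapis.com",
                         "datalineage.googleapis.com"))
```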
error Resource 'projects/<project_id>/locations/<location>/lakes/<lake_name>' has nested resources. If the API supports cascading delete, set 'force' to true to delete it and its nested resources. ↓
cause You are attempting to delete a Dataplex lake that still contains dependent nested resources (such as zones or assets) without specifying a cascading delete operation.
fix
Either manually delete all nested resources (zones, assets) within the lake before attempting to delete the lake, or, if using the Python client library, set the force=True parameter in the DeleteLakeRequest to perform a cascading deletion.
error google.api_core.exceptions.MethodNotImplemented: 501 Received http2 header with status: 404 ↓
cause This error typically indicates that the `DataplexServiceClient` is attempting to access Dataplex resources in an incorrect, unsupported, or misconfigured Google Cloud region.
fix
Ensure the DataplexServiceClient is initialized with the correct region where your Dataplex resources are located. You can specify the endpoint by setting client_options={'api_endpoint': 'dataplex.<REGION>.googleapis.com'} during client initialization. Also verify that the project ID is correct and the service account has permissions in that specific region.

Warnings
breaking Some metadata stored in Dataplex Universal Catalog changed on January 12, 2026, to align with original source systems (e.g., Vertex AI, Bigtable, Spanner). Workloads that depend on the specific structure or content of this metadata will need to be adjusted to preserve continuity. ↓
fix Review release notes and documentation for specific metadata changes and update your code to reflect the new structure or consistency with source systems.
gotcha Dataplex enforces strict location constraints for resources. Zones (regional or multi-regional) and their associated assets (e.g., GCS buckets, BigQuery datasets) must strictly match the zone's location type. Attempting to add an asset that violates these constraints (e.g., a 'EU' multi-region BigQuery dataset to a 'europe-west1' regional zone) will result in asset attachment failures. ↓
fix Ensure that the location of your Dataplex zones and the underlying data assets (GCS buckets, BigQuery datasets) are compatible and correctly aligned according to Dataplex's strict location hierarchy rules.
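The location constraint can be made concrete with a small check. The multi-region set and the REGIONAL/MULTI_REGION labels below are illustrative assumptions for the common case, not an exhaustive encoding of Dataplex's rules.

```python
# Illustrative check (assumption: 'EU' and 'US' are the common BigQuery /
# GCS multi-regions; consult the Dataplex docs for the full rule set).
MULTI_REGIONS = {"eu", "us"}

def asset_matches_zone(zone_location_type: str, asset_location: str) -> bool:
    """True if an asset's location is compatible with the zone's location type."""
    asset_is_multi = asset_location.lower() in MULTI_REGIONS
    if zone_location_type == "REGIONAL":
        return not asset_is_multi   # regional zone needs a regional asset
    return asset_is_multi           # multi-region zone needs a multi-region asset

# The failing example from the warning: an 'EU' dataset in a regional zone.
print(asset_matches_zone("REGIONAL", "EU"))            # False
print(asset_matches_zone("REGIONAL", "europe-west1"))  # True
```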
deprecated Dataplex Explore was deprecated on July 22, 2024. Functionality provided by Dataplex Explore is now expected to be handled by BigQuery Studio. ↓
fix Migrate any existing Dataplex Explore workloads or functionalities to BigQuery Studio as per the official migration instructions.
gotcha When programmatically querying Dataplex Catalog Entries using the Python client, you might only retrieve custom Aspect *names* but not their corresponding *values* by default. ↓
fix To retrieve the full Aspect values, ensure you set the `view` parameter (e.g., `EntryView.FULL` or `EntryView.ALL`) when calling methods that accept it, such as `CatalogServiceClient.get_entry` or `CatalogServiceClient.lookup_entry`.
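A minimal sketch of passing the view, using the dict request form that GAPIC clients accept; the entry name is a placeholder, and whether a given method honors `view` should be checked against the current API reference.

```python
# Sketch: request the FULL view so aspect values (not just names) come
# back. The dict below maps onto GetEntryRequest and would be passed as
# CatalogServiceClient.get_entry(request=...); the name is hypothetical.
def full_entry_request(entry_name: str) -> dict:
    return {"name": entry_name, "view": "FULL"}  # EntryView.FULL

req = full_entry_request(
    "projects/my-project/locations/us-central1/entryGroups/my-group/entries/my-entry"
)
print(req["view"])  # FULL
```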
gotcha Running `google-cloud-dataplex` on Python 3.9 or older will trigger `FutureWarning` messages because these Python versions are no longer supported by the library and its core dependencies. Google will not publish further updates for these older Python versions, so critical bug fixes and new features may be missed. ↓
fix Upgrade your Python environment to version 3.10 or higher, then ensure `google-cloud-dataplex` and its dependencies (e.g., `google-api-core`, `google-auth`) are updated to their latest compatible versions.
gotcha The Dataplex client library requires a Google Cloud Project ID for most operations. If not explicitly provided in the client configuration, it defaults to checking the 'GOOGLE_CLOUD_PROJECT' environment variable or the project associated with the default credentials. Failure to provide a project ID will prevent successful API calls. ↓
fix Ensure the 'GOOGLE_CLOUD_PROJECT' environment variable is set, or supply the project ID explicitly in the resource names you pass to the client (e.g., parent=f"projects/{project_id}/locations/{location}"); note that the Dataplex client constructors do not accept a project argument.
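One way to centralize this is a small stdlib-only resolver, sketched below; the precedence (explicit argument, then environment variable) follows the behavior described above.

```python
# Sketch: resolve the project ID once, preferring an explicit argument,
# then the GOOGLE_CLOUD_PROJECT environment variable.
import os

def resolve_project_id(explicit=None) -> str:
    project = explicit or os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise RuntimeError(
            "No project ID: pass one explicitly or set GOOGLE_CLOUD_PROJECT"
        )
    return project

# The resolved ID is then used in resource names, e.g.
# parent = f"projects/{resolve_project_id()}/locations/us-central1"
```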
Install compatibility verified (last tested: 2026-05-12)
python os / libc status wheel install import disk
3.9 alpine (musl) wheel - 2.82s 78.8M
3.9 alpine (musl) - - 2.98s 77.7M
3.9 slim (glibc) wheel 7.3s 1.59s 77M
3.9 slim (glibc) - - 1.41s 75M
3.10 alpine (musl) wheel - 3.23s 78.5M
3.10 alpine (musl) - - 2.97s 77.4M
3.10 slim (glibc) wheel 6.3s 1.29s 76M
3.10 slim (glibc) - - 1.37s 75M
3.11 alpine (musl) wheel - 3.69s 84.7M
3.11 alpine (musl) - - 4.01s 83.6M
3.11 slim (glibc) wheel 5.6s 2.26s 82M
3.11 slim (glibc) - - 2.00s 81M
3.12 alpine (musl) wheel - 3.58s 76.0M
3.12 alpine (musl) - - 3.68s 74.9M
3.12 slim (glibc) wheel 4.7s 2.32s 74M
3.12 slim (glibc) - - 2.50s 73M
3.13 alpine (musl) wheel - 3.17s 75.5M
3.13 alpine (musl) - - 3.50s 74.2M
3.13 slim (glibc) wheel 4.8s 2.25s 73M
3.13 slim (glibc) - - 2.41s 72M
Imports
- DataplexServiceClient
from google.cloud import dataplex_v1
Quickstart last tested: 2026-04-24
import os

from google.cloud import dataplex_v1


def list_lakes(project_id: str, location: str):
    """Lists Dataplex lakes in a given project and location."""
    try:
        client = dataplex_v1.DataplexServiceClient()
        parent = f"projects/{project_id}/locations/{location}"
        print(f"Listing lakes in {parent}:")
        # list_lakes returns an iterable pager; iterating fetches all pages
        for lake in client.list_lakes(parent=parent):
            print(f"- {lake.name} (State: {lake.state.name})")
        print("Lakes listed successfully.")
    except Exception as e:
        print(f"An error occurred: {e}")
        print("Ensure 'gcloud auth application-default login' has been run or GOOGLE_APPLICATION_CREDENTIALS is set.")
        print("Also verify that the Dataplex API is enabled for your project and the service account has the necessary permissions.")


if __name__ == "__main__":
    PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT", "your-gcp-project-id")
    LOCATION = "us-central1"  # or your desired region
    if PROJECT_ID == "your-gcp-project-id":
        print("Set the 'GOOGLE_CLOUD_PROJECT' environment variable or replace 'your-gcp-project-id' with your actual GCP project ID.")
    else:
        list_lakes(PROJECT_ID, LOCATION)