Google Cloud Data Catalog
Google Cloud Data Catalog is a fully managed, highly scalable data discovery and metadata management service. It allows users to discover, manage, and understand data assets across Google Cloud, supporting technical and business metadata. The Python client library, currently at version 3.30.0, provides programmatic access to the Data Catalog API and follows a regular release cadence as part of the broader `google-cloud-python` client libraries.
Warnings
- breaking Google Cloud Data Catalog is deprecated in favor of Dataplex Universal Catalog. While the Data Catalog API and client library still function, new development should leverage Dataplex's comprehensive data management capabilities.
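- gotcha The `google-cloud-datacatalog` client library emits RPC events through Python's standard `logging` module under the `google` logger namespace. These events are not handled by default and are not propagated to the root logger, so you must configure logging explicitly. Log output may contain sensitive information, so restrict access to wherever it is stored.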
- gotcha The minimum supported Python version has increased. The latest client library versions no longer support older Python environments (e.g., 3.6 or below); check the package's declared Python requirement before upgrading.
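The logging gotcha above can be handled with a few lines of standard-library code. The following is a minimal sketch, assuming the library's events are emitted under the `google` logger namespace; the helper name `configure_google_logging` is hypothetical and not part of the client library. Setting the `GOOGLE_SDK_PYTHON_LOGGING_SCOPE=google` environment variable is an alternative that needs no code.

```python
import logging


def configure_google_logging(level: int = logging.DEBUG) -> logging.Logger:
    """Attach a stream handler to the "google" logger namespace.

    Hypothetical helper for illustration; not part of google-cloud-datacatalog.
    """
    logger = logging.getLogger("google")
    # Avoid stacking duplicate handlers if this is called more than once.
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        logger.addHandler(logging.StreamHandler())
    logger.setLevel(level)
    return logger


# Call once at startup, before creating any client.
google_logger = configure_google_logging()
```

Because RPC logs may contain sensitive information, a higher level such as `logging.WARNING` is probably a safer choice outside of debugging sessions.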
Install
-
pip install google-cloud-datacatalog
Imports
- DataCatalogClient
from google.cloud.datacatalog_v1 import DataCatalogClient
- types
from google.cloud.datacatalog_v1 import types
Quickstart
import os

from google.cloud.datacatalog_v1 import DataCatalogClient

# Read the project ID from the GOOGLE_CLOUD_PROJECT environment variable,
# or replace the fallback value with your own project ID.
project_id = os.environ.get('GOOGLE_CLOUD_PROJECT', 'your-gcp-project-id')

try:
    # Create a client (credentials are resolved from the environment)
    client = DataCatalogClient()
    print(f"Data Catalog client created successfully for project: {project_id}")

    # Example: List entry groups (pagination handled automatically)
    parent = f"projects/{project_id}/locations/us-central1"
    print(f"Listing entry groups in {parent}...")
    for entry_group in client.list_entry_groups(parent=parent):
        print(f"  Entry Group: {entry_group.name}")

    print("Quickstart finished. Note: Data Catalog is migrating to Dataplex Universal Catalog.")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure the Data Catalog API is enabled and authentication is set up.")
    print("e.g., export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json")