Google Cloud Data Catalog

3.30.0 · active · verified Tue Mar 31

Google Cloud Data Catalog is a fully managed, highly scalable data discovery and metadata management service. It lets users discover, manage, and understand data assets across Google Cloud, and it supports both technical and business metadata. The Python client library, currently at version 3.30.0, provides programmatic access to the Data Catalog API and is released regularly as part of the broader `google-cloud-python` client libraries.

Warnings

Install

Imports

Quickstart

Initializes a `DataCatalogClient` and lists the existing entry groups in a given Google Cloud project and location (`us-central1` in this example). The example relies on Application Default Credentials, e.g., the `GOOGLE_APPLICATION_CREDENTIALS` environment variable or credentials configured through the Google Cloud SDK.

import os
from google.cloud.datacatalog_v1 import DataCatalogClient

# Read the project ID from the GOOGLE_CLOUD_PROJECT environment variable,
# or replace the fallback value below with your own project ID.
project_id = os.environ.get('GOOGLE_CLOUD_PROJECT', 'your-gcp-project-id')

# Create a client
try:
    client = DataCatalogClient()
    print(f"Data Catalog client created; using project: {project_id}")

    # Example: List entry groups (pagination handled automatically)
    parent = f"projects/{project_id}/locations/us-central1"
    print(f"Listing entry groups in {parent}...")
    for entry_group in client.list_entry_groups(parent=parent):
        print(f"  Entry Group: {entry_group.name}")

    print("Quickstart finished. Note: Data Catalog is migrating to Dataplex Universal Catalog.")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure the Data Catalog API is enabled and authentication is set up.")
    print("e.g., export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json")
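
The quickstart catches every exception in one place, which makes it hard to tell an authentication problem from a missing permission. The sketch below is one possible way to restructure it, not the library's recommended pattern: the helper names `build_location_parent` and `list_entry_group_names` are hypothetical and introduced only for illustration, and the default location is assumed to be `us-central1` as in the quickstart. The client imports are deferred so that the path helper can be used without the library installed.

```python
from typing import List


def build_location_parent(project_id: str, location: str) -> str:
    """Build the parent resource name used by the quickstart (hypothetical helper)."""
    if not project_id or not location:
        raise ValueError("project_id and location must be non-empty")
    return f"projects/{project_id}/locations/{location}"


def list_entry_group_names(project_id: str, location: str = "us-central1") -> List[str]:
    """Return the names of all entry groups in one location (hypothetical helper)."""
    # Imported lazily so build_location_parent works without the client library.
    from google.api_core import exceptions as api_exceptions
    from google.auth import exceptions as auth_exceptions
    from google.cloud.datacatalog_v1 import DataCatalogClient

    parent = build_location_parent(project_id, location)
    try:
        client = DataCatalogClient()
        # The returned pager fetches further pages as it is iterated.
        return [group.name for group in client.list_entry_groups(parent=parent)]
    except auth_exceptions.DefaultCredentialsError as exc:
        raise RuntimeError("No default credentials found; set up authentication.") from exc
    except api_exceptions.PermissionDenied as exc:
        raise RuntimeError(f"Permission denied for {parent}; check IAM roles and that the API is enabled.") from exc
    except api_exceptions.GoogleAPICallError as exc:
        raise RuntimeError(f"Data Catalog API call failed for {parent}: {exc}") from exc
```

Calling `list_entry_group_names("your-gcp-project-id")` would return a plain list of resource names, and each failure mode surfaces as a `RuntimeError` whose message points at the likely cause.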
