Google Cloud Datacatalog Lineage

0.6.0 · active · verified Thu Apr 16

The `google-cloud-datacatalog-lineage` client library lets Python developers call the Google Cloud Data Catalog Lineage API. The API records where data originates and how it is transformed within Google Cloud, which gives visibility into data pipelines. The current version is 0.6.0. Like other Google Cloud client libraries, it is released frequently, typically to track changes in the underlying API or to ship bug fixes.

Common errors

Warnings

Install

pip install google-cloud-datacatalog-lineage

Imports
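
The quickstart relies on a single import, the versioned `datacatalog_lineage_v1` module. As a minimal sketch, assuming the library may be an optional dependency in your project, a guarded import makes a missing install visible up front. The `LINEAGE_AVAILABLE` flag is a hypothetical name used only for illustration and is not part of the library.

```python
# Guarded import: records whether the lineage client library is installed.
try:
    from google.cloud import datacatalog_lineage_v1
    LINEAGE_AVAILABLE = True  # hypothetical flag, not part of the library
except ImportError:
    # Covers both a missing `google` namespace and a missing submodule.
    datacatalog_lineage_v1 = None
    LINEAGE_AVAILABLE = False

if not LINEAGE_AVAILABLE:
    print("google-cloud-datacatalog-lineage is not installed.")
```

Code that always requires the library can use the plain `from google.cloud import datacatalog_lineage_v1` line on its own.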

Quickstart

This quickstart initializes a `LineageClient` and lists the data lineage processes in a given Google Cloud project and location. It includes basic error handling and notes the project ID and authentication setup the call depends on.

import os

from google.api_core import exceptions as api_exceptions
from google.auth import exceptions as auth_exceptions
from google.cloud import datacatalog_lineage_v1

# Authentication uses Application Default Credentials (ADC). In a GCP
# environment this is automatic; for local development, point ADC at a key file:
# export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/keyfile.json"

# Replace with your actual GCP project ID and desired location.
project_id = os.environ.get("GCP_PROJECT_ID", "your-gcp-project-id")
location = "us-central1"  # e.g., "us-central1", "europe-west1"

try:
    client = datacatalog_lineage_v1.LineageClient()
    parent = f"projects/{project_id}/locations/{location}"

    print(f"Listing processes in {parent}...")
    # list_processes returns a pager; iterating it fetches further pages on demand.
    count = 0
    for process in client.list_processes(parent=parent):
        print(f"Found process: {process.name}")
        count += 1

    print(f"Done: {count} process(es) found.")

except auth_exceptions.DefaultCredentialsError as e:
    # No usable credentials were found when creating the client.
    print(f"Authentication failed: {e}")
    print("Set GOOGLE_APPLICATION_CREDENTIALS or run in a GCP environment.")
except api_exceptions.GoogleAPICallError as e:
    # The API returned an error (for example, permission denied or not found).
    print(f"API call failed: {e}")
    print("Check that GCP_PROJECT_ID is set and the location is correct.")
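
The quickstart falls back to the placeholder project ID when `GCP_PROJECT_ID` is unset, so a misconfigured environment only surfaces as an API error. The following is a minimal sketch of failing earlier, assuming you want to validate the inputs before creating a client. `build_parent` and `PLACEHOLDER_PROJECT_ID` are hypothetical names introduced for illustration and are not part of the library.

```python
import os

# Placeholder value used by the quickstart when GCP_PROJECT_ID is unset.
PLACEHOLDER_PROJECT_ID = "your-gcp-project-id"


def build_parent(project_id: str, location: str) -> str:
    """Return the parent resource name, rejecting empty or placeholder values."""
    if not project_id or project_id == PLACEHOLDER_PROJECT_ID:
        raise ValueError("Set GCP_PROJECT_ID to a real project ID.")
    if not location:
        raise ValueError("A location such as 'us-central1' is required.")
    return f"projects/{project_id}/locations/{location}"


if __name__ == "__main__":
    project_id = os.environ.get("GCP_PROJECT_ID", PLACEHOLDER_PROJECT_ID)
    try:
        print(build_parent(project_id, "us-central1"))
    except ValueError as err:
        print(f"Configuration error: {err}")
```

The returned string can be passed as the `parent` argument to `client.list_processes`, exactly as in the quickstart.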
