Google Cloud Dataproc Metastore API client library

1.21.0 · active · verified Sun Mar 29

Google Cloud Dataproc Metastore is a fully managed, highly available, autohealing, serverless Apache Hive metastore (HMS) that runs on Google Cloud. It simplifies technical metadata management for data lakes and provides interoperability between data processing engines such as Apache Hive, Apache Spark, and Presto. The `google-cloud-dataproc-metastore` Python client library lets developers interact with the service programmatically. The library is maintained in the `google-cloud-python` monorepo, whose client libraries are released frequently, often weekly.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to initialize the Dataproc Metastore client and list existing Metastore services within a specified Google Cloud project and location. Ensure your environment is authenticated (e.g., via `gcloud auth application-default login`) and the Dataproc Metastore API is enabled for your project.

import os
from google.cloud.metastore_v1.services.dataproc_metastore import DataprocMetastoreClient
from google.cloud.metastore_v1.types import ListServicesRequest

def list_metastore_services(project_id: str, location: str) -> None:
    """Lists Dataproc Metastore services in a given project and location.

    Args:
        project_id: Your Google Cloud project ID.
        location: The Google Cloud location (e.g., 'us-central1').
    """
    # Instantiates a client
    client = DataprocMetastoreClient()

    # The resource name of the location where the services are located.
    # Example: "projects/my-project/locations/us-central1"
    parent = f"projects/{project_id}/locations/{location}"

    # Construct the request
    request = ListServicesRequest(parent=parent)

    # Call the API
    try:
        page_result = client.list_services(request=request)

        print(f"Dataproc Metastore services in {parent}:")
        found_services = False
        for service in page_result:
            print(f"- {service.name} (State: {service.state.name})")
            found_services = True
        if not found_services:
            print("  No Dataproc Metastore services found.")
    except Exception as e:
        print(f"Error listing services: {e}")
        print("Ensure the API is enabled, credentials are set, and the location is valid.")

# To run this quickstart:
# 1. Ensure `gcloud auth application-default login` has been run or `GOOGLE_APPLICATION_CREDENTIALS` is set.
# 2. Set the `GOOGLE_CLOUD_PROJECT` environment variable to your project ID.
# 3. Set the `GOOGLE_CLOUD_LOCATION` environment variable to your desired location (e.g., "us-central1").
# Example usage:
# GOOGLE_CLOUD_PROJECT='your-project-id' GOOGLE_CLOUD_LOCATION='us-central1' python your_script_name.py
if __name__ == "__main__":
    project_id = os.environ.get("GOOGLE_CLOUD_PROJECT", "")
    location = os.environ.get("GOOGLE_CLOUD_LOCATION", "")

    if not project_id:
        print("Please set the GOOGLE_CLOUD_PROJECT environment variable.")
    elif not location:
        print("Please set the GOOGLE_CLOUD_LOCATION environment variable.")
    else:
        list_metastore_services(project_id, location)
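
The quickstart builds the parent resource name and formats the output inside the same function that calls the API, so it cannot be exercised without credentials. A minimal sketch of one way to separate those pieces follows. The helper names `location_parent` and `format_service_lines` are hypothetical and not part of the client library; the sketch only assumes that each service object exposes `name` and `state.name`, as the quickstart above does.

```python
from types import SimpleNamespace
from typing import Iterable, List


def location_parent(project_id: str, location: str) -> str:
    """Builds the parent resource name in the form the quickstart uses."""
    if not project_id or not location:
        raise ValueError("project_id and location must both be non-empty")
    return f"projects/{project_id}/locations/{location}"


def format_service_lines(services: Iterable) -> List[str]:
    """Returns one display line per service, or a single 'none found' line.

    Each item only needs `.name` and `.state.name`, so the pager returned by
    `client.list_services(...)` in the quickstart could be passed directly.
    """
    lines = [f"- {s.name} (State: {s.state.name})" for s in services]
    return lines or ["  No Dataproc Metastore services found."]


if __name__ == "__main__":
    # Stand-in objects for illustration only; these are not real API responses.
    parent = location_parent("my-project", "us-central1")
    fake_services = [
        SimpleNamespace(
            name=f"{parent}/services/demo",
            state=SimpleNamespace(name="ACTIVE"),
        )
    ]
    print(f"Dataproc Metastore services in {parent}:")
    print("\n".join(format_service_lines(fake_services)))
```

With helpers like these, `list_metastore_services` would reduce to creating the client, calling `client.list_services`, and printing the returned lines, while the string handling can be checked offline with stand-in objects.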
