Google Cloud Discovery Engine

0.18.0 · active · verified Thu Apr 09

The Google Cloud Discovery Engine API client library provides Python access to Google Cloud's AI-powered search and recommendations service. It enables developers to build rich, personalized search and browse experiences across various data sources, including websites, structured data, and media. The current version is 0.18.0; releases follow Google Cloud's frequent cadence and are often tied to API updates.

Warnings

Install

Imports

Quickstart

This quickstart runs a basic search query with the Discovery Engine SearchServiceClient. It builds a SearchRequest that targets a serving config, sends it, and prints the first page of results. The client authenticates with Application Default Credentials, so set those up first and replace the placeholder values with your own Google Cloud project details. Discovery Engine often requires the project *number* instead of the project *ID* in resource paths.

import os
from google.cloud import discoveryengine_v1beta as discoveryengine

def search_discovery_engine(
    project_number: str,
    location_id: str,
    data_store_id: str,
    query: str,
):
    # Construct the parent resource name for the serving config.
    # Format: projects/{project_number}/locations/{location}/dataStores/{data_store_id}
    # Note: Discovery Engine often expects project *number*, not project *ID*.
    parent = (
        f"projects/{project_number}/locations/{location_id}/dataStores/{data_store_id}"
    )

    client = discoveryengine.SearchServiceClient()

    # The request body for a search operation.
    request = discoveryengine.SearchRequest(
        serving_config=(
            f"{parent}/servingConfigs/default_config"  # Or a custom serving config
        ),
        query=query,
        # QueryExpansionSpec is a nested message of SearchRequest.
        query_expansion_spec=discoveryengine.SearchRequest.QueryExpansionSpec(
            condition=discoveryengine.SearchRequest.QueryExpansionSpec.Condition.AUTO
        ),
        page_size=5,  # Limit to 5 results per page for brevity
    )

    response = client.search(request=request)

    print(f"\nSearch results for query: '{query}'")
    if not response.results:
        print("No results found.")
        return

    for result in response.results:
        print("-" * 20)
        print(f"Document ID: {result.document.id}")
        # derived_struct_data holds parsed content such as title, link, and snippets.
        derived = result.document.derived_struct_data
        if derived:
            title = derived.get('title')
            link = derived.get('link')
            snippets = derived.get('snippets')
            print(f"Title: {title if title else 'N/A'}")
            print(f"Link: {link if link else 'N/A'}")
            if snippets:
                print(f"Snippet: {snippets[0].get('snippet')}")

# Example usage:
if __name__ == "__main__":
    # Replace with your actual Google Cloud Project Number, Location, and Data Store ID.
    # Ensure your service account has 'Discovery Engine User' role.
    project_number = os.environ.get("GCP_PROJECT_NUMBER", "YOUR_PROJECT_NUMBER")
    location = os.environ.get("GCP_LOCATION", "global")  # e.g., 'global', 'us', 'eu'
    data_store_id = os.environ.get("DISCOVERY_ENGINE_DATA_STORE_ID", "YOUR_DATA_STORE_ID")
    search_query = "Google Cloud AI solutions"

    if "YOUR_PROJECT_NUMBER" in project_number or "YOUR_DATA_STORE_ID" in data_store_id:
        print("Please set GCP_PROJECT_NUMBER, GCP_LOCATION, and DISCOVERY_ENGINE_DATA_STORE_ID environment variables ")
        print("or replace the placeholder values in the quickstart code to run it.")
    else:
        search_discovery_engine(project_number, location, data_store_id, search_query)
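
The serving config name that the quickstart assembles inline can also be built by a small helper, which makes the path format easier to check before any request is sent. The sketch below is illustrative only: `serving_config_name` and `regional_api_endpoint` are hypothetical helpers, not part of the client library. The path format is the one shown in the quickstart, and the regional host pattern `{location}-discoveryengine.googleapis.com` is an assumption based on how Google's samples address non-global locations.

```python
from typing import Optional


def serving_config_name(
    project_number: str,
    location_id: str,
    data_store_id: str,
    serving_config_id: str = "default_config",
) -> str:
    """Build the serving config resource name used in the quickstart (hypothetical helper)."""
    for label, value in (
        ("project_number", project_number),
        ("location_id", location_id),
        ("data_store_id", data_store_id),
        ("serving_config_id", serving_config_id),
    ):
        # An empty segment or a stray "/" would silently yield a different resource path.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty path segment, got {value!r}")
    return (
        f"projects/{project_number}/locations/{location_id}"
        f"/dataStores/{data_store_id}/servingConfigs/{serving_config_id}"
    )


def regional_api_endpoint(location_id: str) -> Optional[str]:
    """Return a regional endpoint host, or None to keep the client's default endpoint.

    Assumption: non-global locations are served from
    "{location}-discoveryengine.googleapis.com".
    """
    if location_id == "global":
        return None
    return f"{location_id}-discoveryengine.googleapis.com"
```

If the endpoint helper returns a host, it would typically be passed to the client as `client_options=ClientOptions(api_endpoint=...)`, with `ClientOptions` imported from `google.api_core.client_options`. For the `global` location the client can be constructed without options, as in the quickstart.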

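
The result loop in the quickstart reads `title`, `link`, and `snippets` from each document's derived data and falls back to 'N/A' when a field is missing. That logic can be isolated in a function that works on any mapping, which makes it testable with plain dictionaries and no API call. This is a minimal sketch: `summarize_derived_data` is a hypothetical name, and the three field names are taken from the quickstart; which fields are actually present depends on the data store.

```python
from typing import Any, Dict, Mapping


def summarize_derived_data(derived: Mapping[str, Any]) -> Dict[str, str]:
    """Pull title, link, and the first snippet out of a derived-data mapping."""
    # "snippets" is expected to be a list of mappings, each with a "snippet" key.
    snippets = derived.get("snippets") or []
    first = snippets[0] if len(snippets) > 0 else {}
    return {
        "title": derived.get("title") or "N/A",
        "link": derived.get("link") or "N/A",
        "snippet": first.get("snippet") or "N/A",
    }


if __name__ == "__main__":
    sample = {
        "title": "Example page",
        "link": "https://example.com",
        "snippets": [{"snippet": "An example snippet."}],
    }
    print(summarize_derived_data(sample))
```

Inside the quickstart loop, the function would be called with the document's derived data in place of the individual `.get(...)` lines.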