Google Cloud Discovery Engine
The Google Cloud Discovery Engine API client library provides Python access to Google Cloud's AI-powered search and recommendations service. It lets developers build personalized search and browse experiences over websites, structured data, and media. The current version is 0.18.0; releases are frequent and usually track updates to the underlying API.
Warnings
- gotcha Resource names often require the Google Cloud project *number* rather than the project *ID*. Look up the project number (for example with `gcloud projects describe PROJECT_ID --format="value(projectNumber)"`) and use it when constructing resource paths such as `projects/{project_number}/locations/{location}/dataStores/{data_store_id}`.
- gotcha The examples here import the versioned `discoveryengine_v1beta` module. The unversioned `google.cloud.discoveryengine` import may resolve to a different API version that lacks the newest APIs or features, which surfaces as `AttributeError` or missing functionality.
- gotcha Every API call needs valid Google Cloud credentials. Set up Application Default Credentials with `gcloud auth application-default login`, or point the `GOOGLE_APPLICATION_CREDENTIALS` environment variable at a service account key file. Without them, calls fail with authentication or permission errors.
- gotcha Discovery Engine's search and recommendation features rely on 'Serving Configs'. A search or recommendation request without a valid `serving_config` resource name will fail or return incomplete results.
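The resource-name warnings above can be sketched as a small helper. This is a minimal sketch, not part of the library: the function name `build_serving_config_name`, the `require_number` flag, and the `default_config` default (borrowed from the quickstart) are illustrative assumptions. It only assembles and sanity-checks the string, and does not confirm that the resource exists.

```python
import re

# Serving config name used in the quickstart; yours may differ.
DEFAULT_SERVING_CONFIG = "default_config"


def build_serving_config_name(
    project_number,
    location,
    data_store_id,
    serving_config=DEFAULT_SERVING_CONFIG,
    require_number=True,
):
    """Assemble a data store serving config resource name (hypothetical helper)."""
    project_number = str(project_number)
    # Catch the common mistake of passing a project ID such as "my-project".
    if require_number and not re.fullmatch(r"\d+", project_number):
        raise ValueError(
            f"expected a numeric project number, got {project_number!r}"
        )
    parts = {
        "location": location,
        "data_store_id": data_store_id,
        "serving_config": serving_config,
    }
    for label, value in parts.items():
        # Each segment must be a single, non-empty path component.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return (
        f"projects/{project_number}/locations/{location}"
        f"/dataStores/{data_store_id}/servingConfigs/{serving_config}"
    )
```

Passing `require_number=False` skips the numeric check for setups where a project ID is accepted.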
Install
pip install google-cloud-discoveryengine
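After installing, it can be useful to confirm which version is present, since releases are frequent. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="google-cloud-discoveryengine"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found or "google-cloud-discoveryengine is not installed")
```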
Imports
- SearchServiceClient
from google.cloud import discoveryengine_v1beta as discoveryengine
client = discoveryengine.SearchServiceClient()
- CompletionServiceClient
from google.cloud import discoveryengine_v1beta as discoveryengine
client = discoveryengine.CompletionServiceClient()
- RecommendationServiceClient
from google.cloud import discoveryengine_v1beta as discoveryengine
client = discoveryengine.RecommendationServiceClient()
- UserEventServiceClient
from google.cloud import discoveryengine_v1beta as discoveryengine
client = discoveryengine.UserEventServiceClient()
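The clients above are created with default options, which target the global endpoint. For data stores in a non-global location, Google's published samples pass a regional `api_endpoint` through `ClientOptions`. The sketch below follows that pattern; `regional_api_endpoint` and `make_search_client` are hypothetical helper names, and the host format should be checked against the current documentation for your location.

```python
from typing import Optional


def regional_api_endpoint(location: str) -> Optional[str]:
    """Return the regional endpoint host, or None to keep the default (global)."""
    location = location.strip().lower()
    if not location:
        raise ValueError("location must not be empty")
    if location == "global":
        return None
    return f"{location}-discoveryengine.googleapis.com"


def make_search_client(location: str):
    """Create a SearchServiceClient bound to the endpoint for `location`."""
    # Imported lazily so the endpoint helper works without the library installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud import discoveryengine_v1beta as discoveryengine

    endpoint = regional_api_endpoint(location)
    options = ClientOptions(api_endpoint=endpoint) if endpoint else None
    return discoveryengine.SearchServiceClient(client_options=options)
```

The same `client_options` argument is accepted by the other service clients, so the helper can be adapted for them.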
Quickstart
import os

from google.cloud import discoveryengine_v1beta as discoveryengine


def search_discovery_engine(
    project_number: str,
    location_id: str,
    data_store_id: str,
    query: str,
) -> None:
    # Build the data store resource name that owns the serving config.
    # Format: projects/{project_number}/locations/{location}/dataStores/{data_store_id}
    # Note: Discovery Engine often expects the project *number*, not the project *ID*.
    parent = (
        f"projects/{project_number}/locations/{location_id}/dataStores/{data_store_id}"
    )

    client = discoveryengine.SearchServiceClient()

    # The request body for a search operation.
    request = discoveryengine.SearchRequest(
        serving_config=f"{parent}/servingConfigs/default_config",  # Or a custom serving config
        query=query,
        # QueryExpansionSpec is nested under SearchRequest.
        query_expansion_spec=discoveryengine.SearchRequest.QueryExpansionSpec(
            condition=discoveryengine.SearchRequest.QueryExpansionSpec.Condition.AUTO
        ),
        page_size=5,  # Limit to 5 results per page for brevity
    )

    response = client.search(request=request)

    print(f"\nSearch results for query: '{query}'")
    # Only the first page of results is inspected here.
    if not response.results:
        print("No results found.")
        return

    for result in response.results:
        print("-" * 20)
        print(f"Document ID: {result.document.id}")
        if result.document.derived_struct:
            # derived_struct contains parsed content like title, link, snippets.
            title = result.document.derived_struct.get("title")
            link = result.document.derived_struct.get("link")
            snippets = result.document.derived_struct.get("snippets")
            print(f"Title: {title or 'N/A'}")
            print(f"Link: {link or 'N/A'}")
            if snippets:
                print(f"Snippet: {snippets[0].get('snippet')}")


# Example usage:
if __name__ == "__main__":
    # Replace with your actual Google Cloud project number, location, and data store ID.
    # Ensure your service account has the 'Discovery Engine User' role.
    project_number = os.environ.get("GCP_PROJECT_NUMBER", "YOUR_PROJECT_NUMBER")
    location = os.environ.get("GCP_LOCATION", "global")  # e.g., 'global', 'us', 'eu'
    data_store_id = os.environ.get("DISCOVERY_ENGINE_DATA_STORE_ID", "YOUR_DATA_STORE_ID")
    search_query = "Google Cloud AI solutions"

    if "YOUR_PROJECT_NUMBER" in project_number or "YOUR_DATA_STORE_ID" in data_store_id:
        print(
            "Please set the GCP_PROJECT_NUMBER, GCP_LOCATION, and "
            "DISCOVERY_ENGINE_DATA_STORE_ID environment variables,"
        )
        print("or replace the placeholder values in the quickstart code to run it.")
    else:
        search_discovery_engine(project_number, location, data_store_id, search_query)
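Because the quickstart fails without credentials, a quick local preflight can save a confusing first run. The sketch below checks the usual file locations for Application Default Credentials using only the standard library. `adc_candidates` and `has_local_credentials` are hypothetical helper names, and the check is deliberately incomplete: it does not cover credentials supplied by a metadata server on Google Cloud runtimes or a relocated gcloud config directory, and it does not validate the file contents.

```python
import os
from pathlib import Path

ADC_FILENAME = "application_default_credentials.json"


def adc_candidates(env=None, home=None):
    """List file paths where Application Default Credentials are commonly stored."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    paths = []
    # An explicit key file path takes priority when set.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        paths.append(Path(explicit))
    # File written by `gcloud auth application-default login`.
    appdata = env.get("APPDATA")
    if os.name == "nt" and appdata:
        paths.append(Path(appdata) / "gcloud" / ADC_FILENAME)
    else:
        paths.append(home / ".config" / "gcloud" / ADC_FILENAME)
    return paths


def has_local_credentials(env=None, home=None):
    """Return True if any candidate credentials file exists on disk."""
    return any(path.is_file() for path in adc_candidates(env, home))


if __name__ == "__main__":
    if not has_local_credentials():
        print("No local credentials file found; run "
              "`gcloud auth application-default login` or set "
              "GOOGLE_APPLICATION_CREDENTIALS.")
```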