Google Cloud AutoML Client Library

2.18.1 verified Tue May 12 auth: no python install: verified quickstart: verified

The `google-cloud-automl` Python client library provides programmatic access to the Google Cloud AutoML API, which lets developers train custom machine learning models without extensive ML expertise. The service builds on Google's transfer learning and Neural Architecture Search technology. The library is currently at version 2.18.1 and, like other Google Cloud client libraries, is released frequently, often weekly or bi-weekly. Although the library remains functional, Google strongly recommends the Vertex AI SDK (`google-cloud-aiplatform`) for new development, because Vertex AI is the next generation of Google's AI platform and adds enhanced features and MLOps capabilities.

pip install google-cloud-automl
error ModuleNotFoundError: No module named 'google.cloud.automl_v1beta1'
cause This error occurs when the `google-cloud-automl` Python client library, or a required sub-module like `automl_v1beta1`, is not installed or accessible in your current Python environment.
fix
Install or upgrade the library using pip: pip install --upgrade google-cloud-automl. Ensure you are running this command in the correct virtual environment if you are using one.
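Before reinstalling, it can help to confirm which interpreter is running and whether it can locate the module at all. The following is a minimal sketch using only the standard library; the helper name `module_available` is hypothetical and is not part of `google-cloud-automl`.

```python
import importlib.util
import sys


def module_available(dotted_name: str) -> bool:
    """Return True if the current interpreter can locate `dotted_name`."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (a subclass of ImportError)
        # when a parent package such as `google` is itself missing.
        return False


if __name__ == "__main__":
    name = "google.cloud.automl_v1beta1"
    # The interpreter path shows which environment pip should target.
    print(f"interpreter: {sys.executable}")
    print(f"{name} importable: {module_available(name)}")
```

If the printed interpreter path is not the one inside your virtual environment, the package was probably installed into a different environment.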
error Request had invalid authentication credentials.
cause This indicates an issue with authentication to Google Cloud, such as missing or expired credentials, incorrect service account setup, or insufficient IAM permissions for the AutoML API.
fix
Ensure the GOOGLE_APPLICATION_CREDENTIALS environment variable is set to the path of a valid service account key JSON file, and that the associated service account has the AutoML Editor IAM role or equivalent permissions.
error The AutoML API is deprecated. Planned removal date is September 30, 2025.
cause This is a deprecation warning indicating that the Google Cloud AutoML API and its client library are being phased out in favor of Google Cloud's next-generation AI platform, Vertex AI.
fix
For new development, migrate to the Vertex AI SDK (google-cloud-aiplatform) to leverage enhanced features and MLOps capabilities, as recommended by Google.
error ValueError: Protocol message Dataset has no "tables_dataset_metadata" field.
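cause This error typically arises from an incompatibility between the installed `google-cloud-automl` library version and the parameters used for dataset creation, or from using an outdated library version. It also occurs when Tables metadata is passed to the `automl_v1` `Dataset` message, because `tables_dataset_metadata` is defined only on the `automl_v1beta1` `Dataset`.
fix
Upgrade the google-cloud-automl library to the latest version: pip install --upgrade google-cloud-automl. For AutoML Tables datasets, import the client and types from google.cloud.automl_v1beta1 rather than google.cloud.automl_v1.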
deprecated Google strongly recommends migrating to the Vertex AI SDK for Python (`google-cloud-aiplatform`) for new machine learning development. The `google-cloud-automl` library primarily targets an older generation of AutoML tools. While this library remains functional, future feature development, enhanced capabilities, and the best user experience are focused within Vertex AI.
fix Install `google-cloud-aiplatform` (`pip install google-cloud-aiplatform`) and refer to the Vertex AI SDK documentation for modern AutoML and ML Platform functionalities.
breaking This library requires Python >= 3.7. The last version compatible with Python 3.6 is `google-cloud-automl==2.7.3`, and Python 2.7 support ended with `google-cloud-automl==1.0.1`.
fix Upgrade your Python environment to 3.7 or newer. If you must use Python 3.6, explicitly pin your dependency to `google-cloud-automl==2.7.3`.
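The version boundaries above can be expressed as a small lookup, for example in a bootstrap script that decides which requirement string to install. This is only a sketch: the function name `requirement_for` is hypothetical, and it covers only the interpreter versions named in the note.

```python
import sys

# Version boundaries taken from the compatibility note above.
LAST_PY36_RELEASE = "google-cloud-automl==2.7.3"
LAST_PY27_RELEASE = "google-cloud-automl==1.0.1"


def requirement_for(version_info=sys.version_info) -> str:
    """Return a pip requirement string suited to the given interpreter version."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) >= (3, 7):
        return "google-cloud-automl"  # unpinned: resolves to the latest release
    if (major, minor) == (3, 6):
        return LAST_PY36_RELEASE
    if (major, minor) == (2, 7):
        return LAST_PY27_RELEASE
    # The note does not state a compatible release for other versions.
    raise RuntimeError(f"No known compatible release for Python {major}.{minor}")


if __name__ == "__main__":
    print(requirement_for())
```

Passing an explicit tuple such as `(3, 6)` makes the function easy to check without switching interpreters.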
gotcha Correct authentication and IAM permissions are required to call the AutoML API. Common failures: the AutoML API is not enabled in the Google Cloud project, the service account lacks the necessary roles (e.g., `roles/automl.editor`), or the `GOOGLE_APPLICATION_CREDENTIALS` environment variable does not point to your service account key file. These mistakes are especially common in non-standard execution environments such as Jupyter notebooks.
fix 1. Ensure the Google Cloud AutoML API is enabled in your project. 2. Create a Google Cloud service account with at least the `AutoML Editor` role. 3. Download the JSON key file for the service account. 4. Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the absolute path of this JSON key file (e.g., `export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your-service-account-key.json`). 5. Set `GOOGLE_CLOUD_PROJECT` to your project ID.
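The local parts of these steps (4 and 5) can be checked from inside the process that will call the API, which is useful in notebooks where shell exports may not reach the kernel. The sketch below uses only the standard library; `check_auth_environment` is a hypothetical helper, and it cannot verify that the API is enabled or that the IAM role is granted.

```python
import json
import os


def check_auth_environment(environ=os.environ) -> list:
    """Return a list of problems found; an empty list means the local checks passed."""
    problems = []
    if not environ.get("GOOGLE_CLOUD_PROJECT"):
        problems.append("GOOGLE_CLOUD_PROJECT is not set")

    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isabs(key_path):
        problems.append(f"key path is not absolute: {key_path}")
    elif not os.path.isfile(key_path):
        problems.append(f"key file does not exist: {key_path}")
    else:
        try:
            with open(key_path, encoding="utf-8") as fh:
                key = json.load(fh)
        except (OSError, ValueError) as exc:
            problems.append(f"key file is not readable JSON: {exc}")
        else:
            # Service account key files carry a "type" field with this value.
            if not isinstance(key, dict) or key.get("type") != "service_account":
                problems.append("key file is not a service account key")
    return problems


if __name__ == "__main__":
    for problem in check_auth_environment():
        print(f"problem: {problem}")
```

Running this in a notebook cell reports what the kernel actually sees, which may differ from the shell where the variables were exported.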
python  os / libc      status  wheel  install  import  disk
3.10    alpine (musl)          wheel  -        1.84s   71.9M
3.10    alpine (musl)          -      -        1.76s   70.7M
3.10    slim (glibc)           wheel  6.1s     1.04s   70M
3.10    slim (glibc)           -      -        1.02s   68M
3.11    alpine (musl)          wheel  -        2.47s   76.9M
3.11    alpine (musl)          -      -        2.72s   75.7M
3.11    slim (glibc)           wheel  5.2s     1.65s   75M
3.11    slim (glibc)           -      -        1.61s   73M
3.12    alpine (musl)          wheel  -        2.43s   68.3M
3.12    alpine (musl)          -      -        2.67s   67.1M
3.12    slim (glibc)           wheel  4.4s     1.90s   66M
3.12    slim (glibc)           -      -        2.03s   65M
3.13    alpine (musl)          wheel  -        2.28s   67.9M
3.13    alpine (musl)          -      -        2.70s   66.7M
3.13    slim (glibc)           wheel  4.7s     1.81s   66M
3.13    slim (glibc)           -      -        2.18s   64M
3.9     alpine (musl)          wheel  -        1.70s   71.9M
3.9     alpine (musl)          -      -        1.60s   70.9M
3.9     slim (glibc)           wheel  7.1s     1.29s   70M
3.9     slim (glibc)           -      -        1.12s   69M

This quickstart demonstrates how to list existing AutoML datasets in a specified Google Cloud project and location. Ensure your `GOOGLE_CLOUD_PROJECT` environment variable is set to your project ID, and `GOOGLE_CLOUD_LOCATION` (defaulting to `us-central1`) is set to the desired region. You must also have the AutoML API enabled and appropriate IAM permissions (e.g., `roles/automl.viewer` or `roles/automl.editor`) for your authenticated service account.

import os
from google.cloud import automl_v1
from google.api_core.exceptions import GoogleAPIError

def list_automl_datasets(project_id: str, location: str):
    """Lists all AutoML datasets for a given project and location."""
    try:
        client = automl_v1.AutoMlClient()
        project_location = f"projects/{project_id}/locations/{location}"
        
        print(f"Listing datasets for project '{project_id}' in location '{location}'...")
        datasets = client.list_datasets(parent=project_location)
        
        found_datasets = False
        for dataset in datasets:
            found_datasets = True
            print(f"- Dataset name: {dataset.display_name} (ID: {dataset.name.split('/')[-1]})")
            print(f"  Full Resource Name: {dataset.name}")
            print(f"  Example count: {dataset.example_count}")
            print(f"  Creation Time: {dataset.create_time.strftime('%Y-%m-%d %H:%M:%S UTC')}")

        if not found_datasets:
            print("No datasets found.")

    except GoogleAPIError as e:
        print(f"An API error occurred: {e}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

if __name__ == "__main__":
    project_id = os.environ.get("GOOGLE_CLOUD_PROJECT")
    location = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1") # Default location

    if not project_id:
        print("Error: GOOGLE_CLOUD_PROJECT environment variable not set.")
        print("Please set it to your Google Cloud project ID.")
    else:
        list_automl_datasets(project_id, location)
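
The quickstart builds the parent path and extracts the dataset ID with inline string operations. As a sketch, the same logic can be isolated into small pure functions that are testable without network access. The helper names `location_path` and `dataset_id_from_name` are hypothetical, and the validation assumes the dataset resource name has the shape `projects/{project}/locations/{location}/datasets/{dataset_id}`.

```python
def location_path(project_id: str, location: str) -> str:
    """Build the parent path passed to list_datasets in the quickstart."""
    if not project_id or not location:
        raise ValueError("project_id and location must be non-empty")
    return f"projects/{project_id}/locations/{location}"


def dataset_id_from_name(resource_name: str) -> str:
    """Extract the trailing dataset ID from a full dataset resource name."""
    parts = resource_name.split("/")
    # Assumed shape: projects/{project}/locations/{location}/datasets/{dataset_id}
    well_formed = (
        len(parts) == 6
        and parts[0] == "projects"
        and parts[2] == "locations"
        and parts[4] == "datasets"
        and all(parts)
    )
    if not well_formed:
        raise ValueError(f"unexpected dataset resource name: {resource_name!r}")
    return parts[5]


if __name__ == "__main__":
    print(location_path("my-project", "us-central1"))
    print(dataset_id_from_name(
        "projects/my-project/locations/us-central1/datasets/TBL123"
    ))
```

Unlike `split('/')[-1]`, the validating version raises on a malformed name instead of silently returning the wrong segment.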