Google Cloud Video Intelligence

2.19.0 verified Tue May 12 auth: no python install: verified quickstart: stale

The Google Cloud Video Intelligence API Python client library (current version 2.19.0) lets developers analyze video content by detecting objects, scenes, and activities and by transcribing speech. It extracts metadata such as labels, shot changes, and explicit content from videos stored in Google Cloud Storage or passed inline as bytes. The library is actively maintained and updated frequently as part of the larger `google-cloud-python` ecosystem.

pip install google-cloud-videointelligence
error google.api_core.exceptions.RetryError: Timeout of 600.0s exceeded, last exception: 504 Deadline Exceeded
cause This error occurs when the video processing time exceeds the default or configured timeout limit, often with longer videos or complex analysis features.
fix
For longer videos, upload the video to Google Cloud Storage and pass it via `input_uri` instead of `input_content`. If the request still times out, increase the `timeout` argument in the `operation.result()` call or split the video into smaller segments.
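One way to read the "smaller segments" advice is to annotate a long video in time slices instead of re-encoding it. The sketch below only computes slice boundaries. `split_into_segments` is a hypothetical helper, and feeding each pair into a `VideoSegment` inside `VideoContext.segments` is an assumption about how you might wire it into a request.

```python
from datetime import timedelta


def split_into_segments(duration_s, chunk_s):
    """Return (start, end) timedelta pairs covering 0..duration_s in chunk_s steps."""
    if duration_s <= 0 or chunk_s <= 0:
        raise ValueError("duration_s and chunk_s must be positive")
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        bounds.append((timedelta(seconds=start), timedelta(seconds=end)))
        start = end
    return bounds


# Hypothetical usage: send one request per slice, building each slice as
#   videointelligence.VideoSegment(start_time_offset=start, end_time_offset=end)
# and placing it in VideoContext(segments=[...]).
for start, end in split_into_segments(duration_s=100, chunk_s=40):
    print(f"{start.total_seconds():.0f}s to {end.total_seconds():.0f}s")
```

Smaller slices finish sooner individually, at the cost of more requests to track.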
error PERMISSION_DENIED: The caller does not have permission
cause This error indicates that the Google Cloud service account or user credentials used by your application lack the necessary IAM permissions to access the Video Intelligence API or the Google Cloud Storage bucket containing the video.
fix
Ensure the service account has the 'Cloud Video Intelligence User' role and 'Storage Object Viewer' (or similar read) permissions on the relevant GCS bucket. Also, verify that the Video Intelligence API is enabled in your Google Cloud project.
error Request contains an invalid argument.
cause This often happens when the `input_uri` for the video is in an incorrect format (e.g., `https://` instead of `gs://`) or when `input_content` is used for a video that should be in Cloud Storage.
fix
Ensure that video URIs use the `gs://bucket-id/object-id` format for videos in Google Cloud Storage. If passing video bytes directly, use the `input_content` parameter and ensure `input_uri` is not set.
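These checks can be made before the request ever leaves your machine. The following is a minimal sketch: `build_annotate_request` is a hypothetical helper, not part of the client library, and the `"LABEL_DETECTION"` string is only a placeholder for a real `videointelligence.Feature` value.

```python
def build_annotate_request(features, input_uri=None, input_content=None):
    """Build a request dict with exactly one of input_uri / input_content set."""
    if (input_uri is None) == (input_content is None):
        raise ValueError("Provide exactly one of input_uri or input_content")
    request = {"features": list(features)}
    if input_uri is not None:
        prefix = "gs://"
        bucket, sep, obj = input_uri[len(prefix):].partition("/")
        # Reject https:// URLs, bare paths, and URIs missing a bucket or object.
        if not (input_uri.startswith(prefix) and bucket and sep and obj):
            raise ValueError(f"Expected gs://bucket-id/object-id, got {input_uri!r}")
        request["input_uri"] = input_uri
    else:
        request["input_content"] = input_content
    return request


# Placeholder feature value; in real code pass videointelligence.Feature members.
print(build_annotate_request(["LABEL_DETECTION"], input_uri="gs://bucket-id/object-id"))
```

The returned dict has the same shape as the `request=` argument used in the quickstart.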
error ModuleNotFoundError: No module named 'google.cloud.videointelligence'
cause This error typically occurs when the `google-cloud-videointelligence` library is not installed or the Python environment is not correctly configured to find the installed packages.
fix
Install the library using pip: `pip install google-cloud-videointelligence`. If it is already installed, ensure you are running your script within the Python virtual environment where the library was installed.
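A quick way to tell "not installed" apart from "installed in a different environment" is to ask the running interpreter directly. This is a small sketch using only the standard library; `is_importable` is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if module_name can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example 'google') is itself missing.
        return False


# Shows which Python is running, which is often the real source of the error.
print("Interpreter:", sys.executable)
print("Importable:", is_importable("google.cloud.videointelligence"))
```

If the interpreter path is not the one inside your virtual environment, activate the environment or install the package with that interpreter's `pip`.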
gotcha Authentication is critical. Ensure your environment is correctly authenticated, typically via Application Default Credentials. For local development, `gcloud auth application-default login` is recommended, or explicitly setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to a service account key file path.
fix Run `gcloud auth application-default login`, or set `os.environ['GOOGLE_APPLICATION_CREDENTIALS']` to the service account key file path before creating the client.
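Before creating a client, it can help to confirm what the environment variable actually points to. The sketch below is illustrative only: `credentials_hint` is a hypothetical helper, and it inspects just `GOOGLE_APPLICATION_CREDENTIALS`, not the other places Application Default Credentials may look.

```python
import os
from pathlib import Path


def credentials_hint(env=None):
    """Describe the state of GOOGLE_APPLICATION_CREDENTIALS in the given environment."""
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        return "GOOGLE_APPLICATION_CREDENTIALS is unset; relying on other ADC sources"
    if not Path(key_path).is_file():
        return f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {key_path}"
    return f"GOOGLE_APPLICATION_CREDENTIALS points to an existing file: {key_path}"


print(credentials_hint())
```

An unset variable is not necessarily an error, since credentials from `gcloud auth application-default login` are found without it.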
gotcha Most video annotation operations are asynchronous and return a `google.api_core.operation.Operation` object. You must explicitly call `.result()` on this operation object and wait for its completion to retrieve the actual API response. Failure to do so will result in an `Operation` object, not the annotation results.
fix After calling an asynchronous client method (e.g., `client.annotate_video()`), store the returned value as an `operation` and then call `result = operation.result(timeout=...)`.
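The call-then-wait pattern can be wrapped in one function so the wait is never forgotten. This is a sketch under stated assumptions: `annotate_and_wait` is a hypothetical wrapper, and `FakeClient` / `FakeOperation` are stand-ins that let the example run without credentials. A real `VideoIntelligenceServiceClient` would be passed in their place.

```python
def annotate_and_wait(client, request, timeout_s=600):
    """Start an annotation and block until the long-running operation finishes."""
    operation = client.annotate_video(request=request)  # returns before analysis is done
    return operation.result(timeout=timeout_s)  # blocks until done, or raises


# Stand-ins so the sketch runs offline; they are not part of the library.
class FakeOperation:
    def result(self, timeout=None):
        return {"annotation_results": ["<labels>"], "waited_with_timeout": timeout}


class FakeClient:
    def annotate_video(self, request):
        return FakeOperation()


response = annotate_and_wait(
    FakeClient(), {"input_uri": "gs://bucket-id/object-id"}, timeout_s=900
)
print(response["waited_with_timeout"])
```

Keeping the timeout as a parameter makes it easy to raise for longer videos.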
gotcha The Video Intelligence API has different versions (e.g., `v1`, `v1p1beta1`). Ensure you import the correct version (e.g., `videointelligence_v1`) and use features available in that specific version. Beta features may not be stable or present in the stable API.
fix Always import specific API versions (e.g., `from google.cloud import videointelligence_v1 as videointelligence`) and consult the documentation for the features supported by that version.
gotcha For features like `LABEL_DETECTION` and `SHOT_CHANGE_DETECTION`, you can specify different underlying models (e.g., `builtin/stable`, `builtin/latest`). Google may update or deprecate these models, which could lead to changes in detection results over time if not explicitly pinned or monitored.
fix If consistent results are critical, consider explicitly setting the `model` field in the `LabelDetectionConfig` or `ShotChangeDetectionConfig` to `builtin/stable`. Monitor release notes for model updates and deprecations.
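Pinning can be expressed as plain data. The sketch below builds a dict that mirrors the `VideoContext` field names (`label_detection_config`, `shot_change_detection_config`, each with a `model` field). `pinned_video_context` is a hypothetical helper, and passing this dict as the `video_context` entry of a request dict is an assumption based on the dict-style request used in the quickstart.

```python
def pinned_video_context(model="builtin/stable"):
    """Return a video_context dict that pins the model for label and shot detection."""
    return {
        "label_detection_config": {"model": model},
        "shot_change_detection_config": {"model": model},
    }


context = pinned_video_context()
print(context["label_detection_config"]["model"])
```

Recording the pinned value alongside stored results makes later changes in detection output easier to trace.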
gotcha The library's logging events (when enabled via `GOOGLE_SDK_PYTHON_LOGGING_SCOPE`) may contain sensitive information. Google may also refine the occurrence, level, and content of log messages without flagging such changes as breaking. Do not depend on the immutability of logging events or store sensitive data in logs without proper access restrictions.
fix Restrict access to stored logs. Do not rely on specific log message formats or contents for application logic.
python os / libc status wheel install import disk
3.10 alpine (musl) - - 1.68s 68.7M
3.10 slim (glibc) - - 1.05s 66M
3.11 alpine (musl) - - 2.45s 73.3M
3.11 slim (glibc) - - 1.51s 71M
3.12 alpine (musl) - - 2.40s 64.8M
3.12 slim (glibc) - - 1.92s 63M
3.13 alpine (musl) - - 2.36s 64.4M
3.13 slim (glibc) - - 1.99s 62M
3.9 alpine (musl) - - 1.49s 68.8M
3.9 slim (glibc) - - 1.17s 67M

This quickstart demonstrates how to use the `google-cloud-videointelligence` client library to detect labels within a video stored in Google Cloud Storage. It initializes the client, configures label detection, sends an annotation request, and waits for the long-running operation to complete, then prints the detected labels.

import os
from google.cloud import videointelligence_v1 as videointelligence

# Set GOOGLE_APPLICATION_CREDENTIALS environment variable or ensure gcloud is authenticated.
# For local development, run `gcloud auth application-default login`.
# os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/your/key.json'

def analyze_video_labels(gcs_uri):
    """Detects labels in the video specified by the GCS URI."""
    client = videointelligence.VideoIntelligenceServiceClient()
    features = [videointelligence.Feature.LABEL_DETECTION]

    # Optional: Configure label detection mode for more granular control
    config = videointelligence.LabelDetectionConfig(
        label_detection_mode=videointelligence.LabelDetectionMode.SHOT_AND_FRAME_MODE,
        stationary_camera=False # Set to True if analyzing footage from a stationary camera
    )
    video_context = videointelligence.VideoContext(label_detection_config=config)

    print(f'Processing video for label annotations: {gcs_uri}')
    operation = client.annotate_video(
        request={
            "input_uri": gcs_uri,
            "features": features,
            "video_context": video_context
        }
    )

    # Long-running operations must be waited for.
    print('\nWaiting for operation to complete...')
    result = operation.result(timeout=600)  # Adjust timeout as needed (in seconds)

    print('\nFinished processing.')
    # First result is retrieved because a single video is processed
    annotation_result = result.annotation_results[0]

    # In the 2.x client, duration fields are datetime.timedelta objects (no .nanos attribute).
    for shot_label in annotation_result.shot_label_annotations:
        print(f'Video shot label: {shot_label.entity.description} ({shot_label.entity.entity_id})')
        for segment in shot_label.segments:
            start_time = segment.segment.start_time_offset.total_seconds()
            end_time = segment.segment.end_time_offset.total_seconds()
            print(f'\tSegment: {start_time:.1f}s to {end_time:.1f}s (confidence: {segment.confidence:.2f})')

    for frame_label in annotation_result.frame_label_annotations:
        print(f'Video frame label: {frame_label.entity.description} ({frame_label.entity.entity_id})')
        for frame in frame_label.frames:
            time_offset = frame.time_offset.total_seconds()
            print(f'\tFrame: {time_offset:.1f}s (confidence: {frame.confidence:.2f})')

if __name__ == '__main__':
    # Replace with your GCS video URI
    # Public sample video from Google Cloud documentation
    video_uri = "gs://cloud-samples-data/video/chicago.mp4"
    analyze_video_labels(video_uri)