Google Cloud Video Intelligence

2.19.0 · active · verified Sat Mar 28

The Google Cloud Video Intelligence API Python client library (current version 2.19.0) enables developers to analyze video content: detecting objects, scenes, and activities, and transcribing speech. It extracts metadata, such as labels, shot changes, and explicit content, from videos stored in Google Cloud Storage or supplied inline as bytes. The library is actively maintained with frequent updates as part of the larger `google-cloud-python` ecosystem.

Warnings

Install
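The library is published on PyPI as `google-cloud-videointelligence` and can be installed with pip:

```shell
pip install --upgrade google-cloud-videointelligence
```

Installing inside a virtual environment is the usual recommendation for Google Cloud client libraries.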

Imports
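The versioned module path is the conventional import; `videointelligence_v1` is the stable API surface used throughout the quickstart below:

```python
from google.cloud import videointelligence_v1 as videointelligence

# Names referenced in the quickstart:
#   videointelligence.VideoIntelligenceServiceClient  (service client)
#   videointelligence.Feature                         (feature enum, e.g. LABEL_DETECTION)
#   videointelligence.VideoContext                    (per-request configuration)
```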

Quickstart

This quickstart demonstrates how to use the `google-cloud-videointelligence` client library to detect labels within a video stored in Google Cloud Storage. It initializes the client, configures label detection, sends an annotation request, and waits for the long-running operation to complete, then prints the detected labels.

import os
from google.cloud import videointelligence_v1 as videointelligence

# Set GOOGLE_APPLICATION_CREDENTIALS environment variable or ensure gcloud is authenticated.
# For local development, run `gcloud auth application-default login`.
# os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/your/key.json'

def analyze_video_labels(gcs_uri):
    """Detects labels in the video specified by the GCS URI."""
    client = videointelligence.VideoIntelligenceServiceClient()
    features = [videointelligence.Feature.LABEL_DETECTION]

    # Optional: Configure label detection mode for more granular control
    config = videointelligence.LabelDetectionConfig(
        label_detection_mode=videointelligence.LabelDetectionMode.SHOT_AND_FRAME_MODE,
        stationary_camera=False # Set to True if analyzing footage from a stationary camera
    )
    video_context = videointelligence.VideoContext(label_detection_config=config)

    print(f'Processing video for label annotations: {gcs_uri}')
    operation = client.annotate_video(
        request={
            "input_uri": gcs_uri,
            "features": features,
            "video_context": video_context
        }
    )

    # Long-running operations must be waited for.
    print('\nWaiting for operation to complete...')
    result = operation.result(timeout=600)  # Adjust timeout as needed (in seconds)

    print('\nFinished processing.')
    # Only one video was processed, so retrieve the first (and only) result.
    annotation_result = result.annotation_results[0]

    for shot_label in annotation_result.shot_label_annotations:
        print(f'Video shot label: {shot_label.entity.description} ({shot_label.entity.entity_id})')
        for segment in shot_label.segments:
            # In 2.x releases, time offsets are datetime.timedelta objects,
            # not protobuf Durations, so use total_seconds() rather than .nanos.
            start_time = segment.segment.start_time_offset.total_seconds()
            end_time = segment.segment.end_time_offset.total_seconds()
            print(f'\tSegment: {start_time:.1f}s to {end_time:.1f}s (confidence: {segment.confidence:.2f})')

    for frame_label in annotation_result.frame_label_annotations:
        print(f'Video frame label: {frame_label.entity.description} ({frame_label.entity.entity_id})')
        for frame in frame_label.frames:
            # Same timedelta conversion as above.
            time_offset = frame.time_offset.total_seconds()
            print(f'\tFrame: {time_offset:.1f}s (confidence: {frame.confidence:.2f})')

if __name__ == '__main__':
    # Replace with your GCS video URI
    # Public sample video from Google Cloud documentation
    video_uri = "gs://cloud-samples-data/video/chicago.mp4"
    analyze_video_labels(video_uri)
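A note on the time arithmetic: in the 2.x releases of this library, proto `Duration` fields such as `time_offset` are surfaced as standard `datetime.timedelta` objects, so `total_seconds()` is the simplest conversion. A minimal sketch, using a hypothetical offset value rather than a real API response:

```python
from datetime import timedelta

# Hypothetical offset, shaped the way the 2.x client returns it:
# a datetime.timedelta rather than a raw protobuf Duration.
offset = timedelta(seconds=12, microseconds=500000)

# total_seconds() folds days, seconds, and microseconds into one float.
print(f'{offset.total_seconds():.1f}s')  # → 12.5s
```

Older 1.x samples read `.seconds` and `.nanos` directly off the protobuf message; that pattern raises `AttributeError` against the 2.x objects.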
