Run:AI Model Streamer for GCS
The Run:AI Model Streamer for GCS is a Python library that streams machine learning models directly from Google Cloud Storage, so a model can be loaded without first downloading it in full to local disk. It is part of the broader Run:AI model-streamer project, which provides a common abstraction over several object stores. The current version is 0.15.8. Releases are infrequent and typically track the core `model-streamer-base` package.
Warnings
- gotcha GCP authentication is required to access Google Cloud Storage. Make sure the `GOOGLE_APPLICATION_CREDENTIALS` environment variable or your default GCP authentication is configured correctly. Without valid credentials, GCS operations fail with permission errors.
- gotcha The `runai-model-streamer-gcs` package strictly depends on the `model-streamer-base` package with the same major and minor version (e.g., `~=0.15.8`, which allows `>=0.15.8` but excludes `0.16.0` and later). When upgrading, update both packages together to avoid compatibility issues.
- gotcha The library supports two main patterns: `GCSStreamer.open` for streaming as a file-like object and `GCSStreamer.download_file` for downloading to a local path. Choose the method appropriate for your ML framework, as some require a local file path (e.g., certain Hugging Face model loaders) while others can load from file-like objects (e.g., PyTorch's `torch.load`).
- deprecated The `runai-model-streamer-gcs` package officially requires Python 3.8 through 3.11; its PyPI metadata declares `requires_python: >=3.8,<3.12`. It may work on newer Python versions, but compatibility is not guaranteed.
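The environment-related warnings above (Python range, credentials variable, matching package versions) can be checked before any GCS call is made. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the distribution name `model-streamer-base` is taken from the text above and assumed to match the installed package name. A missing `GOOGLE_APPLICATION_CREDENTIALS` is reported only as a hint, since other default credential sources may still apply.

```python
import os
import sys
from importlib import metadata


def python_supported(version_info=sys.version_info):
    """Return True when the interpreter is within the declared >=3.8,<3.12 range."""
    return (3, 8) <= tuple(version_info[:2]) < (3, 12)


def same_minor(version_a, version_b):
    """Return True when two version strings share their major and minor parts."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


def credentials_file_configured(env=os.environ):
    """Return True when GOOGLE_APPLICATION_CREDENTIALS names an existing file."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def preflight(gcs_dist="runai-model-streamer-gcs", base_dist="model-streamer-base"):
    """Collect human-readable problems; an empty list means nothing was flagged."""
    problems = []
    if not python_supported():
        problems.append("Python version is outside the supported 3.8 to 3.11 range")
    if not credentials_file_configured():
        # Only a hint: other default credential sources are not inspected here.
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is unset or not a file")
    gcs_version = installed_version(gcs_dist)
    base_version = installed_version(base_dist)
    if gcs_version and base_version and not same_minor(gcs_version, base_version):
        problems.append(
            f"{gcs_dist} {gcs_version} and {base_dist} {base_version} differ in major.minor"
        )
    return problems
```

Calling `preflight()` at startup and logging the returned list is one way to surface these issues early, instead of hitting a permission or import error in the middle of a model load.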
Install
pip install runai-model-streamer-gcs
Imports
- GCSStreamer
from runai.model_streamer.gcs import GCSStreamer
Quickstart
import os
from runai.model_streamer.gcs import GCSStreamer

# --- Setup GCS Path (replace with your actual path) ---
# For a real scenario, ensure your environment is authenticated with GCP.
# E.g., by setting GOOGLE_APPLICATION_CREDENTIALS or using `gcloud auth login`.
# This example uses a placeholder and will not work without a valid GCS path and auth.
gcs_model_path = os.environ.get(
    'GCS_MODEL_PATH',
    'gs://your-bucket-name/path/to/your/model.pt'  # REPLACE THIS
)

# --- Example 1: Stream a file-like object ---
# Useful for frameworks that can load directly from file-like objects (e.g., PyTorch torch.load)
try:
    if not gcs_model_path.startswith('gs://your-bucket-name'):  # Check if placeholder is still there
        print(f"Attempting to stream from: {gcs_model_path}")
        with GCSStreamer.open(gcs_model_path, 'rb') as f:
            # In a real scenario, you'd load your model here, e.g., model = torch.load(f)
            print(f"Successfully opened GCS object for streaming: {gcs_model_path}")
    else:
        print("Skipping streaming example: GCS_MODEL_PATH not set or is placeholder.\nSet GCS_MODEL_PATH environment variable or modify the script.")
except Exception as e:
    print(f"Error during streaming: {e}")

# --- Example 2: Download to a local temporary file ---
# Useful for frameworks that require a local file path (e.g., Hugging Face, TensorFlow)
local_temp_path = '/tmp/my_streamed_model.tmp'
try:
    if not gcs_model_path.startswith('gs://your-bucket-name'):  # Check if placeholder is still there
        print(f"Attempting to download from: {gcs_model_path} to {local_temp_path}")
        GCSStreamer.download_file(gcs_model_path, local_temp_path)
        # In a real scenario, you'd load your model here, e.g., model = MyFramework.load(local_temp_path)
        print(f"Successfully downloaded GCS object to: {local_temp_path}")
    else:
        print("Skipping download example: GCS_MODEL_PATH not set or is placeholder.\nSet GCS_MODEL_PATH environment variable or modify the script.")
except Exception as e:
    print(f"Error during download: {e}")
finally:
    if os.path.exists(local_temp_path):
        os.remove(local_temp_path)  # Clean up temporary file
        print(f"Cleaned up temporary file: {local_temp_path}")
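The download example above writes to a fixed path under `/tmp` and cleans up by hand. As a possible alternative, the sketch below wraps the download in a context manager backed by `tempfile.TemporaryDirectory`, so the local copy is removed even if loading raises. The helper name `downloaded_model` is hypothetical, and `download_fn` is assumed to be any callable taking `(source, destination)`, which is the shape the quickstart uses for `GCSStreamer.download_file`.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def downloaded_model(download_fn, gcs_path):
    """Download gcs_path into a private temp directory and yield the local path.

    download_fn is any callable taking (source, destination); it is an
    assumption that GCSStreamer.download_file fits that shape.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Keep the object's file name so loaders that inspect extensions still work.
        file_name = os.path.basename(gcs_path) or "model.bin"
        local_path = os.path.join(tmp_dir, file_name)
        download_fn(gcs_path, local_path)
        yield local_path
    # Leaving the with-block above deletes the directory and the downloaded file.
```

Under that assumption, usage would look like `with downloaded_model(GCSStreamer.download_file, gcs_model_path) as path: ...`, with the framework's loader called on `path` inside the block. A per-call temporary directory also avoids collisions when several processes download at the same time.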