Run:ai Model Streamer

0.15.8 · active · verified Mon Apr 13

The Run:ai Model Streamer is an open-source Python SDK that accelerates loading large AI models onto accelerators such as GPUs. It streams tensors directly from a range of storage backends (local file systems, S3, GCS, Azure Blob Storage) into GPU memory, avoiding intermediate local-disk buffering, and is optimized for the SafeTensors file format. The current version is 0.15.8, with regular releases indicating active development.

Warnings

Install
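The quickstart below refers to the `runai-model-streamer` package and to `runai-model-streamer-*` backend packages for cloud storage. A typical install, assuming those PyPI package names, looks like this; install only the backends you need:

```shell
# Core streamer package
pip install runai-model-streamer

# Optional cloud-storage backends, following the
# runai-model-streamer-* naming referenced in the quickstart
pip install runai-model-streamer-s3     # Amazon S3
pip install runai-model-streamer-gcs    # Google Cloud Storage
pip install runai-model-streamer-azure  # Azure Blob Storage
```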

Imports
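The only import the quickstart relies on is the `SafetensorsStreamer` class. The sketch below guards the import so the example degrades gracefully when the package is not installed:

```python
# SafetensorsStreamer is the entry point used in the quickstart below.
try:
    from runai_model_streamer import SafetensorsStreamer
except ImportError:
    # Package not installed; see the Install section.
    SafetensorsStreamer = None
```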

Quickstart

This quickstart demonstrates how to use `SafetensorsStreamer` to initiate streaming of a model. It creates a dummy `safetensors` file for a runnable example. For actual use, `file_path` should point to your model. When working with cloud storage (S3, GCS, Azure), ensure the respective `runai-model-streamer-*` package is installed and authentication environment variables (e.g., `GOOGLE_APPLICATION_CREDENTIALS` for GCS, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` for S3, `AZURE_CLIENT_ID` for Azure) are correctly configured. The `stream_file` method starts the streaming process, and `get_tensors()` can then be used to retrieve tensors.

from runai_model_streamer import SafetensorsStreamer

# This is a placeholder for a safetensors file. In a real scenario,
# 'model.safetensors' would be a path to your model file.
# For a runnable example, you might create a dummy file or adapt to a real path.
# If streaming from cloud storage, ensure appropriate backend package is installed
# and environment variables for authentication are set (e.g., GOOGLE_APPLICATION_CREDENTIALS for GCS).
# For local testing, ensure 'model.safetensors' exists or is mocked.

# Example of creating a dummy safetensors file for local quickstart demonstration
try:
    from safetensors.torch import save_file
    import torch
    dummy_tensor = {'tensor_key': torch.randn(10, 10)}
    save_file(dummy_tensor, 'model.safetensors')
    file_path = "model.safetensors"

    print(f"Attempting to stream from: {file_path}")

    with SafetensorsStreamer() as streamer:
        streamer.stream_file(file_path)
        print("Successfully started streaming.")
        # In a real scenario, you would then iterate and process tensors:
        # for name, tensor in streamer.get_tensors():
        #    gpu_tensor = tensor.to('cuda:0') # Or another accelerator
        #    print(f"Streamed tensor: {name}, shape: {gpu_tensor.shape}")

    print("Streamer context closed.")
except ImportError:
    print("To run this quickstart with a dummy file, install 'safetensors' and 'torch':")
    print("pip install safetensors torch")
    print("Alternatively, replace 'model.safetensors' with an actual path to your model file.")
except Exception as e:
    print(f"An error occurred during quickstart: {e}")
    print("Ensure the file_path is correct and necessary system libraries (libcurl4, libssl1.1_1) are installed.")
    print("If streaming from cloud storage, verify environment variables for authentication are set.")
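As a concrete reference for the authentication variables named above, the fragment below sketches the environment setup for each backend. These are the standard environment variables of each cloud provider's SDK, not streamer-specific settings; the Azure variables shown are those used by the common client-credential flow:

```shell
# Amazon S3 (standard AWS SDK credentials)
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."

# Google Cloud Storage (path to a service-account key file)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/key.json"

# Azure Blob Storage (standard Azure identity variables)
export AZURE_CLIENT_ID="..."
export AZURE_TENANT_ID="..."
export AZURE_CLIENT_SECRET="..."
```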
