GTO (Git Tag Ops)
GTO (Git Tag Ops) is an open-source Python library that turns a Git repository into an Artifact Registry, with a focus on Machine Learning models. It tracks new artifact versions, manages their lifecycle through defined stages, and automates integration with CI/CD systems using a Git-native approach. The current version is 1.9.0, and the project ships frequent minor and patch releases.
Warnings
- breaking: GTO dropped support for Python 3.8 in version 1.7.0. Installing or running GTO on Python 3.8 or older fails; upgrade the Python environment to 3.9 or newer.
- breaking: GTO migrated to Pydantic V2 in version 1.9.0. Pydantic V2 changes how models are defined, validated, and serialized. If your project has other dependencies that pin Pydantic V1, or calls Pydantic V1 APIs directly, this update will likely cause dependency conflicts or runtime errors.
- breaking: GTO updated its internal `scmrepo` dependency to version 3.x in GTO 1.7.0. Major version bumps in `scmrepo` (a library for SCM operations) typically involve breaking API changes. GTO abstracts most direct interaction with `scmrepo`, but users who rely on internal `scmrepo` structures or who install other `scmrepo`-dependent libraries may face compatibility issues.
- gotcha: GTO stores artifact versions and stage assignments as Git tags. `gto` commands create these tags locally; they are not automatically pushed to your remote Git repository. Until you run `git push --tags`, your registry changes are not visible to collaborators or CI/CD systems.
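The version constraints in the warnings above can be checked before upgrading. The following is a minimal sketch using only the standard library; the helper names (`python_is_supported`, `major_version`, `installed_major`) are hypothetical and not part of GTO.

```python
import sys
from importlib import metadata


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """Return True when the interpreter is at least Python 3.9."""
    return tuple(version_info[:2]) >= minimum


def major_version(version_string):
    """Extract the leading major number from a version string such as '2.7.1'."""
    return int(version_string.split(".")[0])


def installed_major(package):
    """Return the installed major version of `package`, or None if it is absent."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python >= 3.9:", python_is_supported())
    # A result of 1 here suggests a likely conflict with GTO 1.9.0
    print("pydantic major version:", installed_major("pydantic"))
    print("scmrepo major version:", installed_major("scmrepo"))
```

Running this in the target environment before `pip install gto` shows whether an existing Pydantic V1 or older `scmrepo` pin is present.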
Install
pip install gto
Imports
- gto
import gto
- gto.api
from gto import api
Quickstart
import os
from pathlib import Path
import tempfile
import shutil
from gto import api
# Create a temporary directory for a mock Git repo
original_cwd = os.getcwd()
temp_dir = Path(tempfile.mkdtemp())
os.chdir(temp_dir)
try:
    # Initialize a Git repository
    os.system('git init -b main')
    os.system('git config user.email "test@example.com"')
    os.system('git config user.name "Test User"')
    # Create a dummy artifact file and commit it
    (temp_dir / "model.pkl").write_text("dummy_model_content")
    os.system('git add .')
    os.system('git commit -m "Initial commit with model.pkl"')
    # Register a new version of the artifact at the current commit.
    # Assumption: gto.api functions take the repository path as their first
    # argument, i.e. register(repo, name, ref). Check this against your
    # installed GTO version.
    print("\n--- Registering artifact 'my-model' ---")
    api.register(".", "my-model", "HEAD")
    print("Artifact registered successfully.\n")
    # Show the current state of the artifact registry
    print("\n--- Showing artifact registry state ---")
    registry_state = api.show(".", name="my-model")
    print(registry_state)
    # Promote the artifact to a 'dev' stage.
    # Assumption: the first registered version is v0.0.1, which creates
    # the Git tag my-model@v0.0.1.
    print("\n--- Promoting 'my-model' to 'dev' stage ---")
    api.assign(".", "my-model", stage="dev", version="v0.0.1")
    print("Artifact promoted to 'dev' stage successfully.\n")
    # Show the updated state
    print("\n--- Showing updated registry state ---")
    updated_registry_state = api.show(".", name="my-model")
    print(updated_registry_state)
    # Important: Push the Git tags for changes to be reflected remotely
    # In a real scenario: os.system('git push origin --tags')
finally:
os.chdir(original_cwd)
shutil.rmtree(temp_dir)
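The quickstart leaves its tags local, so a real workflow needs the push step noted in the warnings. Below is a small sketch of a wrapper around `git push --tags`. The function names (`push_tags_command`, `push_tags`) and the remote name `origin` are assumptions for illustration and are not part of GTO.

```python
import subprocess


def push_tags_command(remote="origin"):
    """Build the git command that publishes all local tags to `remote`."""
    return ["git", "push", remote, "--tags"]


def push_tags(repo_path=".", remote="origin", dry_run=True):
    """Push tags from `repo_path`; with dry_run=True only return the command."""
    command = push_tags_command(remote)
    if dry_run:
        return command
    # check=True raises CalledProcessError if git exits with a non-zero status
    subprocess.run(command, cwd=repo_path, check=True)
    return command


if __name__ == "__main__":
    # Prints the command without contacting any remote
    print(" ".join(push_tags(dry_run=True)))
```

Defaulting to `dry_run=True` keeps the helper safe to call in scripts and tests; pass `dry_run=False` in a repository that has a configured remote.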