scmrepo
scmrepo is an SCM wrapper and fsspec filesystem for Git, commonly used within the DVC ecosystem. It provides a unified API for working with Git repositories through several backends (pygit2, dulwich, and gitpython), and its filesystem interface can read repository contents at a given revision without necessarily requiring a full `git checkout`. The library is actively maintained, with frequent patch and minor releases.
Warnings
- gotcha: scmrepo supports multiple Git backends (dulwich, pygit2, gitpython). To use a specific backend, you must install its corresponding Python package (e.g., `pip install dulwich`) alongside `scmrepo`.
- breaking: For the `pygit2` backend, the `ls_remotes` method was migrated to `list_heads`, and treating `LsRemoteResult` as a dictionary was deprecated.
- breaking: For the `dulwich` backend, `repo stage` was deprecated in favor of `worktree` for some operations, and the minimum `dulwich` version was bumped to `0.24.3`.
- gotcha: The `dulwich` backend has received GPG commit-signing fixes across several versions. If you encounter issues with signed commits using `dulwich`, check `scmrepo` versions `3.5.7` and `3.5.8` for relevant fixes.
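Before debugging a backend-specific problem, it can help to confirm which backend packages are actually importable and whether their versions meet the minimums mentioned above. The following is a minimal sketch that uses only the standard library and does not call scmrepo itself. The helper names (`installed_backends`, `version_tuple`, `meets_minimum`) are hypothetical, and the version parsing is deliberately simple: it reads only the leading numeric components.

```python
from importlib import metadata, util

# Import name -> distribution name for each backend listed above.
# Note: GitPython is imported as "git", not "gitpython".
BACKENDS = {
    "dulwich": "dulwich",
    "pygit2": "pygit2",
    "git": "GitPython",
}


def installed_backends() -> dict:
    """Return {distribution name: version} for each importable backend."""
    found = {}
    for module_name, dist_name in BACKENDS.items():
        if util.find_spec(module_name) is None:
            continue  # backend package is not installed
        try:
            found[dist_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            found[dist_name] = "unknown"
    return found


def version_tuple(version: str) -> tuple:
    """Parse leading numeric components, e.g. '0.24.3' -> (0, 24, 3)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version: str, minimum: str) -> bool:
    """Compare two version strings by their numeric components."""
    return version_tuple(version) >= version_tuple(minimum)


if __name__ == "__main__":
    for name, version in installed_backends().items():
        print(f"{name}: {version}")
```

For example, `meets_minimum(installed_backends().get("dulwich", "0"), "0.24.3")` reports whether the installed `dulwich` satisfies the minimum version named in the warning above. For pre-release or otherwise unusual version strings, a dedicated version parser would be more reliable than this sketch.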
Install
pip install scmrepo
Imports
- GitFileSystem
from scmrepo.fs import GitFileSystem
- Git
from scmrepo.git import Git
Quickstart
import os
import posixpath
import shutil
import tempfile

from scmrepo.fs import GitFileSystem
from scmrepo.git import Git

# Create a temporary directory for the repository
repo_dir = tempfile.mkdtemp()
file_path = os.path.join(repo_dir, "test_file.txt")

try:
    # Initialize a Git repository. Git.init() is called on the class with the
    # target path and returns a Git instance for the new repository.
    git = Git.init(repo_dir)

    # Create and commit a file
    with open(file_path, "w") as f:
        f.write("Hello, scmrepo!\n")
    git.add(file_path)
    git.commit("Initial commit")

    # Use GitFileSystem to read the file as of 'HEAD'
    fs = GitFileSystem(repo_dir, rev="HEAD")
    with fs.open("test_file.txt", "r") as f:
        content = f.read()
    print(f"Content of test_file.txt: {content.strip()}")

    # Walk the committed tree. Paths inside GitFileSystem are POSIX-style,
    # so posixpath.join is used to build them.
    print("\nFiles in repo (from GitFileSystem):")
    for root, dnames, fnames in fs.walk("/"):
        for dname in dnames:
            print(posixpath.join(root, dname))
        for fname in fnames:
            print(posixpath.join(root, fname))
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Clean up the temporary directory
    shutil.rmtree(repo_dir)
    print(f"Cleaned up temporary repository at {repo_dir}")