dvc-objects library

5.2.0 · active · verified Sat Apr 11

dvc-objects provides filesystem and object-database level abstractions, serving as a core component for DVC (Data Version Control) and related data management tools. It handles operations like hashing files, managing object storage, and providing an fsspec-compatible interface for various backends. The library is actively maintained with frequent minor releases.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to hash a local file using `dvc_objects.file.hash_file`, a fundamental operation within the library for content-addressable storage.

import os
import tempfile
from pathlib import Path
from dvc_objects.file import hash_file
from dvc_objects.hash_info import HashInfo

# Create a temporary file for demonstration
with tempfile.TemporaryDirectory() as tmpdir:
    test_file_path = Path(tmpdir) / "my_data.txt"
    test_file_path.write_text("This is some sample data for hashing.")

    print(f"Hashing file: {test_file_path}")
    # Hash the file using the default algorithm (e.g., MD5)
    hash_info: HashInfo = hash_file(test_file_path)

    print(f"\nFile path: {test_file_path}")
    print(f"Hash Type: {hash_info.name}")
    print(f"Hash Value: {hash_info.value}")

    # Example: Verify content based on hash
    if hash_info.name == "md5" and hash_info.value == "2efb72b834458f4a7c1b52b36e355745":
        print("Hash matches expected value!")
    else:
        print("Hash does not match expected value or is not MD5.")

view raw JSON →