DVC HTTP/HTTPS Remote Plugin
dvc-http is a plugin for Data Version Control (DVC) that provides support for HTTP and HTTPS remotes, allowing DVC to store and retrieve data from web servers. It leverages the `fsspec` library to provide filesystem-like access over HTTP(S). The current version is 2.32.0, and the project maintains an active release cadence with frequent updates.
Warnings
- breaking: Timeout parameters for HTTP operations have changed. `sock_read_timeout` and `sock_connect_timeout` are no longer supported.
- gotcha: dvc-http no longer has `dvc` as a direct runtime dependency, nor does it import from `dvc` internally.
- gotcha: pip installs `dvc-http`'s `fsspec[http]` dependency automatically. If `fsspec.filesystem('http')` fails with a 'protocol not found' error, check that `dvc-http` (or `fsspec[http]`) is actually installed in the environment you are running.
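When chasing a 'protocol not found' error, it can help to confirm which modules are visible to the interpreter. The sketch below uses only the standard library. The helper name `missing_modules` is hypothetical, and the module list rests on the assumption that `fsspec[http]` relies on the `aiohttp` package.

```python
from importlib.util import find_spec

# Assumption: these top-level modules back fsspec's HTTP support.
REQUIRED_MODULES = ("fsspec", "aiohttp")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
    print("Try: pip install dvc-http")
else:
    print("fsspec and aiohttp are importable.")
```

An empty result only shows that the modules can be found. It does not prove that a given URL is reachable.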
Install
pip install dvc-http
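To confirm that the install landed in the interpreter you are actually using, the installed distribution version can be queried from Python. This is a minimal sketch built on the standard library's `importlib.metadata`. The helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of `dist_name`, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print("dvc-http:", installed_version("dvc-http") or "not installed")
```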
Imports
- fsspec.filesystem('http')
import fsspec
fs = fsspec.filesystem('http')
Quickstart
import fsspec

# fsspec resolves the 'http' and 'https' protocols to its HTTP filesystem.
# Installing dvc-http brings in the fsspec[http] dependency this relies on,
# so no direct import from dvc_http is typically needed for basic usage.
try:
    fs = fsspec.filesystem("http")

    # Placeholder public URL, used only for demonstration
    file_url = "http://www.textfiles.com/100/abacus.txt"
    print(f"Attempting to read from: {file_url}")

    with fs.open(file_url, "r", encoding="utf-8") as f:
        content = f.read(100)  # Read the first 100 characters

    print(f"\nSuccessfully read from {file_url}:")
    print("--- Content Snippet ---")
    print(content)
    print("-----------------------")
except Exception as e:
    print(f"\nAn error occurred: {e}")
    print("Ensure dvc-http is installed and the URL is accessible.")
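The quickstart catches every exception in one place. If a missing file should be handled differently from other failures, a narrower handler could look like the sketch below. `read_text_or_none` is a hypothetical helper. It assumes the filesystem raises `FileNotFoundError` for a missing resource, which is the usual fsspec convention but worth confirming against your server's responses.

```python
def read_text_or_none(fs, url, size=-1, encoding="utf-8"):
    """Read up to `size` characters from `url`; return None if it is missing.

    `fs` is any fsspec-style filesystem object exposing `open()`.
    Assumption: a missing resource surfaces as FileNotFoundError.
    """
    try:
        with fs.open(url, "r", encoding=encoding) as f:
            return f.read(size)
    except FileNotFoundError:
        return None


# Example call (needs fsspec[http] and network access, so it is commented out):
# import fsspec
# fs = fsspec.filesystem("http")
# snippet = read_text_or_none(fs, "http://www.textfiles.com/100/abacus.txt", 100)
```

Other errors, such as connection failures, still propagate to the caller, so they are not mistaken for an absent file.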