WebDataset
WebDataset is a high-performance Python I/O library for deep learning and data processing; the current version is 1.0.2. It implements the PyTorch IterableDataset interface, streaming samples sequentially from datasets stored in POSIX tar archives. Large datasets can be split into shards, and the library works with PyTorch's DataLoader, which supports scalable data pipelines that tolerate storage latency for data types such as images, audio, and video. The library is actively maintained, with frequent releases that add features and fix bugs.
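The tar-based layout described above can be illustrated with a minimal, standard-library-only sketch. It does not use the webdataset package and is not the library's implementation; it only shows the storage convention, in which files that share a basename inside a tar shard form one sample. The helper names `write_shard` and `iter_samples` are hypothetical, and the sketch assumes flat file names with no directories.

```python
import io
import tarfile
from typing import Dict, Iterator


def write_shard(samples: Dict[str, Dict[str, bytes]]) -> bytes:
    """Pack samples into an in-memory tar; each file is named '<key>.<ext>'.

    Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples.items():
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def iter_samples(shard: bytes) -> Iterator[Dict[str, bytes]]:
    """Read a tar strictly front to back, grouping consecutive files by basename.

    Hypothetical helper; assumes flat names, so the key is everything
    before the first dot and the extension is everything after it.
    """
    current_key, sample = None, {}
    # Mode "r|" opens the archive as a non-seekable stream, which mirrors
    # sequential reads from a pipe or network source.
    with tarfile.open(fileobj=io.BytesIO(shard), mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = member.name.partition(".")
            if key != current_key and sample:
                yield sample  # basename changed: the previous sample is complete
                sample = {}
            current_key = key
            sample["__key__"] = key
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample  # emit the final sample
```

Because samples are recovered in a single forward pass, a shard never needs random access, which is what makes this layout suitable for streaming. Splitting a dataset into many such shards lets separate workers read different shards in parallel.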
API endpoints
full doc /v1/registry/webdataset
install /v1/registry/webdataset/install
compatibility /v1/registry/webdataset/compatibility