WebDataset

library · 1.0.2 · python
verified May 21, 2026

WebDataset is a high-performance Python I/O system for deep learning and data processing; the current version is 1.0.2. It implements the PyTorch IterableDataset interface, which gives efficient streaming access to datasets stored in POSIX tar archives. It supports sharding for large datasets and works with PyTorch's DataLoader, enabling scalable, latency-insensitive data pipelines for data types such as images, audio, and video. The library is actively maintained, with frequent releases that add features and fix bugs.
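To make the tar-based streaming idea concrete, here is a minimal sketch that uses only the Python standard library, not WebDataset itself. It assumes the usual WebDataset storage convention in which files sharing a basename (for example `a.txt` and `a.cls`) form one sample. The helper names `write_shard` and `iter_samples` are hypothetical and chosen for illustration; this is not the library's implementation.

```python
import io
import tarfile


def write_shard(samples):
    """Pack samples into an in-memory tar archive (hypothetical helper).

    Each sample is a dict: {"__key__": str, "<ext>": bytes, ...}.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for sample in samples:
            key = sample["__key__"]
            for ext, payload in sample.items():
                if ext == "__key__":
                    continue
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def iter_samples(shard_bytes):
    """Read a tar archive front to back, grouping members by basename."""
    current = None
    # Mode "r|" treats the archive as a forward-only stream (no seeking),
    # which is the access pattern that makes tar shards cheap to stream.
    with tarfile.open(fileobj=io.BytesIO(shard_bytes), mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumption: flat member names, split at the first dot.
            key, _, ext = member.name.partition(".")
            data = tar.extractfile(member).read()
            if current is None or current["__key__"] != key:
                if current is not None:
                    yield current
                current = {"__key__": key}
            current[ext] = data
    if current is not None:
        yield current


shard = write_shard([
    {"__key__": "a", "txt": b"hello", "cls": b"0"},
    {"__key__": "b", "txt": b"world", "cls": b"1"},
])
samples = list(iter_samples(shard))
```

With the library installed, the reading side would typically be handled by `webdataset.WebDataset(...)` over one or more shard URLs and then passed to a PyTorch `DataLoader`; the exact arguments depend on the dataset and are not shown here.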
