MosaicML Streaming

library 0.13.0 · python
verified May 24, 2026

MosaicML Streaming (StreamingDataset) provides PyTorch-compatible datasets that stream training data from cloud object stores (S3, GCS, Azure Blob Storage, Hugging Face Hub) or read it from a local filesystem. Training can start without first downloading the entire dataset, which improves data loading performance and reduces storage costs. The library is actively maintained; the current version is 0.13.0.
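A minimal sketch of the typical workflow, assuming the package is installed as mosaicml-streaming alongside PyTorch: samples are first written as MDS shards with MDSWriter, then read back through StreamingDataset and a standard DataLoader. The helper names (make_samples, write_and_stream), the column layout, and the sample contents are hypothetical and chosen only for illustration. The example reads from a local directory; for cloud storage, a remote argument such as an s3:// URI would point at the shards and local would serve as the cache directory.

```python
from typing import Any, Dict, List

# Hypothetical schema for illustration: column name -> MDS encoding.
COLUMNS: Dict[str, str] = {"id": "int", "text": "str"}


def make_samples(n: int) -> List[Dict[str, Any]]:
    """Build n toy samples matching COLUMNS (illustrative data only)."""
    return [{"id": i, "text": f"sample {i}"} for i in range(n)]


def write_and_stream(out_dir: str, samples: List[Dict[str, Any]], batch_size: int = 2):
    """Write samples as MDS shards in out_dir, then iterate over them in batches."""
    # Imported lazily so the helpers above work without the library installed.
    from streaming import MDSWriter, StreamingDataset
    from torch.utils.data import DataLoader

    # Step 1: serialize samples into compressed shards plus an index file.
    with MDSWriter(out=out_dir, columns=COLUMNS, compression="zstd") as writer:
        for sample in samples:
            writer.write(sample)

    # Step 2: open the shards. With only `local` set, nothing is downloaded;
    # adding remote="s3://..." (assumed bucket path) would stream shards on demand.
    dataset = StreamingDataset(local=out_dir, shuffle=False, batch_size=batch_size)

    # Step 3: hand the dataset to a regular PyTorch DataLoader.
    loader = DataLoader(dataset, batch_size=batch_size)
    return list(loader)
```

Calling write_and_stream(tempfile.mkdtemp(), make_samples(4)) should return the samples grouped into batches of two. The batch_size passed to StreamingDataset is kept equal to the DataLoader's batch size so the library can partition samples consistently across workers.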
