Petastorm

library · 0.13.1 · python
verified May 24, 2026

Petastorm is an open source Python library for single-node or distributed training and evaluation of machine learning models directly from datasets stored in Apache Parquet format. It provides data access for TensorFlow, PyTorch, and Apache Spark (via PySpark), and can also be used from plain Python code. The current stable version is 0.13.1. Releases follow no fixed schedule: they ship as features land, and stable versions are often preceded by release candidates.
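As a rough illustration of reading a Parquet store directly from Python, the sketch below iterates over batches with `make_batch_reader`, Petastorm's reader for plain Parquet stores. The helper names `to_dataset_url` and `count_rows` and the path `/tmp/example_parquet` are hypothetical and exist only for this example. Petastorm is imported lazily, so the URL helper works even where the library is not installed.

```python
from pathlib import Path


def to_dataset_url(path):
    """Turn a local directory path into a file:// URL.

    Petastorm readers take dataset URLs (file://, hdfs://, s3://, ...)
    rather than bare filesystem paths.
    """
    return Path(path).resolve().as_uri()


def count_rows(dataset_url):
    """Count rows in a plain Parquet store by iterating over batches.

    Hypothetical helper for illustration; requires `pip install petastorm`.
    """
    # Imported here so to_dataset_url() stays usable without petastorm.
    from petastorm import make_batch_reader

    total = 0
    # num_epochs=1 makes the reader stop after a single pass over the data.
    with make_batch_reader(dataset_url, num_epochs=1) as reader:
        for batch in reader:
            # Each batch is a namedtuple holding one array per column,
            # so the length of any column is the batch's row count.
            total += len(batch[0])
    return total


if __name__ == "__main__":
    # Hypothetical location of an existing Parquet directory.
    url = to_dataset_url("/tmp/example_parquet")
    print(url)
```

With Petastorm installed and a real Parquet directory at that location, `count_rows(url)` would walk the store once. For datasets written with Petastorm's own schema support, `make_reader` is the corresponding entry point.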
