repartipy: PySpark DataFrame Partition Size Helper

library 0.1.8 · python
verified May 26, 2026

repartipy is a Python library for managing PySpark DataFrame partition sizes. It provides a function that repartitions a DataFrame toward a target partition size in megabytes, with the aim of improving storage and processing efficiency. As of version 0.1.8 it is a relatively stable, focused utility. It follows no fixed release cadence, and updates are likely driven by PySpark compatibility or feature requests.
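
The arithmetic behind size-based repartitioning can be sketched in plain Python. This is a minimal illustration, not repartipy's own API: the function name `target_partition_count` and the constant `BYTES_PER_MB` are hypothetical, and the sketch assumes the total DataFrame size in bytes has already been estimated by some other means.

```python
import math

# Assumption: "megabyte" is taken to mean 1024 * 1024 bytes (MiB).
BYTES_PER_MB = 1024 * 1024


def target_partition_count(total_size_bytes: int, target_partition_mb: float) -> int:
    """Return how many partitions keep each one near the target size.

    Hypothetical helper for illustration only; it is not part of repartipy.
    """
    if total_size_bytes < 0:
        raise ValueError("total_size_bytes must be non-negative")
    if target_partition_mb <= 0:
        raise ValueError("target_partition_mb must be positive")

    target_bytes = target_partition_mb * BYTES_PER_MB
    # Round up so no partition exceeds the target, and never return
    # fewer than one partition (an empty DataFrame still needs one).
    return max(1, math.ceil(total_size_bytes / target_bytes))


if __name__ == "__main__":
    ten_gib = 10 * 1024 ** 3
    count = target_partition_count(ten_gib, target_partition_mb=128)
    print(count)
    # With a PySpark DataFrame `df`, the count would then be passed to
    # the standard DataFrame method: df.repartition(count)
```

The resulting count is what would be handed to PySpark's `DataFrame.repartition` (or `coalesce` when only reducing the number of partitions). Estimating the total size is the hard part in practice, and it is the step a helper library is expected to take care of.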
