repartipy: PySpark DataFrame Partition Size Helper
repartipy is a Python library for managing PySpark DataFrame partition sizes. It provides a function that repartitions a DataFrame toward a target partition size in megabytes, with the aim of improving storage and processing efficiency. As of version 0.1.8 it is a relatively stable, narrowly focused utility; updates are likely driven by PySpark compatibility or feature requests rather than a fixed release cadence.
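To make the size-based idea concrete, here is a minimal sketch of the arithmetic behind choosing a partition count from a target size in megabytes. It is not repartipy's API: the function name `partition_count_for_target` and its parameters are hypothetical, and it assumes the DataFrame's total size in bytes has already been estimated by some other means.

```python
import math

BYTES_PER_MB = 1024 * 1024  # binary megabyte; an assumption of this sketch


def partition_count_for_target(total_size_bytes: int, target_partition_mb: float) -> int:
    """Return how many partitions keep each one near the target size.

    Hypothetical helper for illustration only; not part of repartipy.
    """
    if total_size_bytes < 0:
        raise ValueError("total_size_bytes must be non-negative")
    if target_partition_mb <= 0:
        raise ValueError("target_partition_mb must be positive")

    target_bytes = target_partition_mb * BYTES_PER_MB
    # Round up so no partition exceeds the target, and never return zero.
    return max(1, math.ceil(total_size_bytes / target_bytes))


# Example: a dataset estimated at 10 GiB with a 128 MB target per partition.
n = partition_count_for_target(10 * 1024**3, 128)
print(n)
```

In a real job the resulting count would typically be passed to PySpark's `DataFrame.repartition(n)`. Estimating the total size accurately is the harder part and is not shown here.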
API endpoints
full doc /v1/registry/repartipy
install /v1/registry/repartipy/install
compatibility /v1/registry/repartipy/compatibility