repartipy: PySpark DataFrame Partition Size Helper
repartipy is a Python library designed to assist with managing PySpark DataFrame partition sizes. It provides a function to repartition a DataFrame based on a target partition size in megabytes, aiming to optimize storage and processing efficiency. As of version 0.1.8, it's a relatively stable and focused utility, with updates likely driven by PySpark compatibility or feature requests rather than a fixed cadence.
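The core idea behind size-based repartitioning is simple arithmetic: estimate the DataFrame's total size, divide by the target partition size, and round up. The sketch below illustrates that calculation in plain Python; the function name and exact rounding are illustrative assumptions, not repartipy's actual internals.

```python
import math

def estimate_partition_count(total_size_mb: float, target_size_mb: float) -> int:
    """Illustrative sketch of how a size-based repartitioner might pick a
    partition count. NOT repartipy's actual implementation."""
    # Divide the estimated total size by the target size and round up,
    # with a floor of one partition for very small data.
    return max(1, math.ceil(total_size_mb / target_size_mb))

print(estimate_partition_count(100, 10))  # -> 10
print(estimate_partition_count(3, 10))    # -> 1
```

Rounding up rather than down is what keeps each resulting partition at or below the target size on average.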
Common errors
- `ModuleNotFoundError: No module named 'pyspark'`
  - Cause: PySpark, a core dependency of repartipy, is not installed in your Python environment or is not accessible.
  - Fix: Install PySpark with `pip install pyspark`, or ensure your environment already provides it (e.g., a Databricks/EMR cluster).
- `Exception: No active SparkSession found. Please provide a SparkSession instance or ensure one is active.`
  - Cause: `repartition_by_size` could not find an active SparkSession, either because one was never created or because it has been stopped.
  - Fix: Pass an active `SparkSession` instance directly via the `spark` argument (e.g., `repartition_by_size(df, ..., spark=my_spark_session)`), or ensure `SparkSession.builder.getOrCreate()` has been called before invoking the function.
- `TypeError: repartition_by_size() missing 1 required positional argument: 'df'`
  - Cause: `repartition_by_size` was called without the required DataFrame argument or with incorrect arguments.
  - Fix: Pass a valid PySpark DataFrame as the first argument, e.g., `repartition_by_size(my_dataframe, target_size_mb=10)`.
Warnings
- Gotcha: `repartition_by_size` returns a new DataFrame; it does not modify the input in place. Always assign the result to a variable or rebind the original name.
- Gotcha: Repartitioning, especially toward a specific size, shuffles data across the cluster, which can be resource-intensive. Use `repartition_by_size` judiciously, especially for very large DataFrames or frequent operations, to avoid performance bottlenecks.
- Gotcha: The `target_partition_size_mb` is an *aim*, not a guarantee. Actual partition sizes vary with data skew, compression, and the underlying storage mechanism, so do not rely on exact partition sizes.
Install
- `pip install repartipy`
Imports
- `repartition_by_size`: `from repartipy import repartition_by_size`
Quickstart
from pyspark.sql import SparkSession
from pyspark.sql.functions import lit
from repartipy import repartition_by_size
# Create a SparkSession
spark = SparkSession.builder \
    .appName("repartipy_quickstart") \
    .master("local[*]") \
    .config("spark.ui.enabled", "false") \
    .getOrCreate()

try:
    # Build a dummy DataFrame for demonstration; adjust the range and
    # string size below to control the actual data volume
    data = [(i, f"value_{i}") for i in range(100000)]
    df = spark.createDataFrame(data, ["id", "value"])

    # Add a large column to increase row size for a more realistic scenario
    df = df.withColumn("large_string", lit("x" * 500))

    print(f"Initial DataFrame row count: {df.count()}")
    print(f"Initial partition count: {df.rdd.getNumPartitions()}")

    # Repartition the DataFrame, aiming for ~10MB partitions
    target_size_mb = 10
    repartitioned_df = repartition_by_size(df, target_size_mb, spark=spark)

    print(f"Repartitioned DataFrame row count: {repartitioned_df.count()}")
    print(f"New partition count (target ~{target_size_mb}MB per partition): {repartitioned_df.rdd.getNumPartitions()}")

    # Perform an action to materialize the repartitioned result, e.g.:
    # repartitioned_df.write.mode("overwrite").parquet("/tmp/repartipy_output")
finally:
    spark.stop()