PySpark Distribution Explorer
PySpark Distribution Explorer (pyspark-dist-explore, current version 0.1.8) is a Python library that enables creating histogram and density plots directly from PySpark DataFrames. It simplifies exploratory data analysis (EDA) for large datasets by leveraging Matplotlib and Pandas to visualize distributions. The project is currently in maintenance mode with infrequent updates.
Common errors
- UserWarning: Matplotlib is currently using agg, which is a non-interactive backend, so figures will not be shown.
  - cause: Matplotlib is configured with a non-interactive backend (such as 'agg') that does not display plots on screen, and `plt.show()` was likely not called.
  - fix: Call `plt.show()` after generating your plot. In an interactive environment (such as Jupyter), use `%matplotlib inline` or `%matplotlib notebook`. Otherwise, save the figure with `plt.savefig('plot.png')`.
- TypeError: cannot convert 'StringType' object to float
  - cause: A PySpark DataFrame column with a non-numeric data type (e.g., StringType) was passed to a plotting function that expects numerical data.
  - fix: Ensure the column being plotted has a numeric type (IntegerType, FloatType, DoubleType). Cast it if necessary: `df.withColumn('numeric_col', df['string_col'].cast('double')).select('numeric_col')`.
- AttributeError: 'DataFrame' object has no attribute 'plot'
  - cause: `.plot()` was called directly on a PySpark DataFrame; PySpark DataFrames do not provide this method.
  - fix: pyspark-dist-explore functions are standalone. Instead of `df.plot()`, create axes with `fig, ax = plt.subplots()` and then call `hist(ax, df.select('column_name'))` or `density_plot(ax, df.select('column_name'))`.
- NameError: name 'spark' is not defined
  - cause: The `SparkSession` object named `spark` was never created or is out of scope at the point of use.
  - fix: Initialize a SparkSession first: `from pyspark.sql import SparkSession; spark = SparkSession.builder.appName("MyApp").getOrCreate()`.
Warnings
- gotcha PySpark-dist-explore functions (`hist`, `density_plot`) require a `matplotlib.axes.Axes` object as their first argument. You must create a Matplotlib figure and axes explicitly before calling these functions.
- gotcha The plotting functions (`hist`, `density_plot`) expect a PySpark DataFrame containing *only* the numerical column(s) you wish to plot. Do not pass the entire DataFrame if it contains non-numerical columns or multiple columns.
- gotcha If plots are not displaying in non-interactive environments (e.g., scripts, remote servers), it might be due to Matplotlib's backend. `plt.show()` is crucial, but an interactive backend might also be needed.
- deprecated The library is in maintenance mode with its last release in 2019 (0.1.8). While functional, it might not receive updates for newer PySpark versions or advanced features, and bug fixes are unlikely.
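When the backend gotcha above bites in a headless script, writing the figure to a file sidesteps `plt.show()` entirely. A matplotlib-only sketch (the bar chart stands in for a pyspark-dist-explore plot, and the output path is an arbitrary choice):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: plt.show() would display nothing
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["a", "b", "c"], [3, 1, 2])  # stand-in for a pyspark-dist-explore plot
out_path = os.path.join(tempfile.gettempdir(), "distribution.png")
fig.savefig(out_path)  # persist the figure instead of showing it
plt.close(fig)
print(out_path)
```

The same `fig.savefig(...)` call works unchanged for figures whose axes were filled by `hist` or `density_plot`.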
Install
- pip install pyspark-dist-explore
Imports
- hist
from pyspark_dist_explore import hist
- density_plot
from pyspark_dist_explore import density_plot
- describe_pd
from pyspark_dist_explore import describe_pd
Quickstart
from pyspark_dist_explore import hist, density_plot, describe_pd
from pyspark.sql import SparkSession
import matplotlib.pyplot as plt
# Ensure SparkSession is available (replace with your actual Spark setup)
# For local testing, ensure pyspark is installed: pip install pyspark
spark = SparkSession.builder.appName("DistExploreQuickstart").getOrCreate()
# Create a sample PySpark DataFrame
data = [
(1, "A", 10.5),
(2, "B", 12.0),
(3, "A", 11.2),
(4, "C", 9.8),
(5, "B", 13.1),
(6, "A", 10.8),
(7, "C", 9.5),
(8, "B", 12.5),
(9, "A", 11.0),
(10, "C", 10.0)
]
columns = ["id", "category", "value"]
df = spark.createDataFrame(data, columns)
print("Original DataFrame:")
df.show()
# 1. Generate a histogram
fig_hist, ax_hist = plt.subplots()
hist(ax_hist, df.select('value'), bins=5, color='skyblue', edgecolor='black')
ax_hist.set_title('Histogram of Value')
ax_hist.set_xlabel('Value')
ax_hist.set_ylabel('Frequency')
plt.tight_layout()
plt.show() # Display the plot
# 2. Generate a density plot
fig_density, ax_density = plt.subplots()
density_plot(ax_density, df.select('value'), color='green', fill=True, alpha=0.5)
ax_density.set_title('Density Plot of Value')
ax_density.set_xlabel('Value')
ax_density.set_ylabel('Density')
plt.tight_layout()
plt.show() # Display the plot
# 3. Get descriptive statistics as a Pandas DataFrame
desc_df = describe_pd(df.select('value'))
print("\nDescriptive Statistics (Pandas DataFrame):")
print(desc_df)
# Stop the SparkSession
spark.stop()