Spark-sklearn: Scikit-learn on Spark
spark-sklearn provides integration tools for running scikit-learn's GridSearchCV and RandomizedSearchCV on Apache Spark clusters. It uses Spark to distribute model training, letting users scale hyperparameter tuning across a cluster. The library is at version 0.3.0, last released in 2017, and is effectively abandoned, with no active development or maintenance since then.
Common errors
- ModuleNotFoundError: No module named 'spark_sklearn'
  Cause: the spark-sklearn package is not installed in your Python environment.
  Fix: run `pip install spark-sklearn`.
- ModuleNotFoundError: No module named 'pyspark'
  Cause: PySpark, a core dependency of spark-sklearn, is not installed.
  Fix: run `pip install pyspark`, and ensure the PySpark version is compatible with your Spark installation.
- Java gateway process exited before sending its port number.
  Cause: a problem with the Spark environment setup, such as an incompatible Java version, insufficient memory, or missing Spark binaries.
  Fix: verify that `JAVA_HOME` points to a compatible Java Development Kit (e.g., Java 8 for Spark 2.x), that `SPARK_HOME` is set correctly and the Spark binaries are accessible, and check the Spark logs for more specific errors.
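As a quick diagnostic for the Java gateway error above, a small sketch (plain Python, no Spark needed) can report the environment variables that error usually hinges on; `spark_env_report` is a hypothetical helper name, not part of any library:

```python
import os
import shutil

def spark_env_report():
    """Collect the environment facts behind the 'Java gateway' error."""
    return {
        "JAVA_HOME": os.environ.get("JAVA_HOME"),       # should point to a compatible JDK
        "SPARK_HOME": os.environ.get("SPARK_HOME"),     # should point to the Spark install
        "java_on_path": shutil.which("java"),           # None if java is not on PATH
    }

for key, value in spark_env_report().items():
    print(f"{key}: {value if value is not None else '<not set>'}")
```

Run this in the same environment (and same shell) where you start Spark; a `<not set>` for either variable is the most common cause of the gateway failure.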
Warnings
- breaking Project is abandoned and unmaintained. The last commit was in 2017, meaning it does not receive bug fixes, security updates, or compatibility patches for newer Python, Spark, or scikit-learn versions.
- breaking Strict compatibility with older Spark and scikit-learn versions. spark-sklearn officially supports Spark 2.x and scikit-learn 0.18.x. Using it with newer versions will likely lead to runtime errors or unexpected behavior.
- gotcha Potential performance overhead due to data serialization/deserialization. Data is often converted between Spark RDD/DataFrame and scikit-learn's numpy arrays, which can incur significant overhead for very large datasets.
- gotcha Functionality is limited to GridSearchCV and RandomizedSearchCV. spark-sklearn does not integrate other scikit-learn functionality, nor does it provide a direct bridge to Spark's native MLlib estimators.
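To see how much work a grid search fans out (and therefore what spark-sklearn distributes), the number of model fits can be computed with scikit-learn's `ParameterGrid`; the grid below mirrors the one used in the Quickstart:

```python
from sklearn.model_selection import ParameterGrid

param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
cv_folds = 3

n_candidates = len(ParameterGrid(param_grid))  # 3 C values x 2 kernels = 6 combinations
n_fits = n_candidates * cv_folds               # each candidate is fit once per CV fold

print(n_candidates, n_fits)  # 6 18
```

Each of these fits is independent, which is why they parallelize cleanly; but note the serialization caveat above, since the training data is shipped to each task.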
Install
-
pip install spark-sklearn pyspark
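Given the compatibility warnings above, pinning versions is safer than an unpinned install. The exact pins below are assumptions (a Spark 2.x / scikit-learn 0.18.x era stack matching the supported versions listed earlier), not a tested combination:

```shell
pip install spark-sklearn==0.3.0 "pyspark>=2.0,<3.0" scikit-learn==0.18.2
```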
Imports
- GridSearchCV
from spark_sklearn import GridSearchCV
- RandomizedSearchCV
from spark_sklearn import RandomizedSearchCV
- SparkContext (imported from PySpark, not from spark_sklearn)
from pyspark import SparkContext
Quickstart
import os

from pyspark import SparkContext
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from spark_sklearn import GridSearchCV

# Initialize SparkContext.
# 'local[*]' works for local testing; set SPARK_MASTER to a cluster URL otherwise.
# Note: the master must be passed explicitly; SparkContext does not read the
# SPARK_MASTER environment variable on its own.
master = os.environ.get('SPARK_MASTER', 'local[*]')
sc = None
try:
    sc = SparkContext(master=master, appName="SparkSklearnExample")

    # Generate some synthetic data
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    # Define the estimator and parameter grid
    estimator = SVC(gamma='auto', random_state=42)
    param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}

    # Spark-backed GridSearchCV: takes the SparkContext as its first argument
    # and distributes the candidate fits across the cluster.
    clf = GridSearchCV(sc, estimator, param_grid, cv=3)
    clf.fit(X_train, y_train)

    print("Best parameters found:", clf.best_params_)
    print("Best cross-validation score:", clf.best_score_)
    print("Test set accuracy:", clf.score(X_test, y_test))
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    if sc:
        sc.stop()
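spark-sklearn's RandomizedSearchCV follows the same pattern, with the SparkContext as the first positional argument. Since it is designed as a drop-in for scikit-learn's version, the sketch below runs the equivalent search locally with plain scikit-learn; swapping the import for `from spark_sklearn import RandomizedSearchCV` and prepending `sc` to the arguments would distribute it (an assumption based on the GridSearchCV signature above, untested given the project's age):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Discrete lists work as distributions; scipy distributions are also accepted.
param_dist = {'C': [0.01, 0.1, 1, 10, 100], 'kernel': ['linear', 'rbf']}

# Sample 5 of the 10 possible combinations instead of exhaustively trying all.
search = RandomizedSearchCV(SVC(gamma='auto', random_state=42), param_dist,
                            n_iter=5, cv=3, random_state=42)
search.fit(X, y)

print("Best parameters found:", search.best_params_)
```

Randomized search keeps the cost bounded by `n_iter` rather than the full grid size, which matters on a cluster where each candidate still pays the data-shipping overhead noted in the warnings.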