Spark NLP
John Snow Labs Spark NLP is a natural language processing library built on top of Apache Spark ML, providing performant and accurate NLP annotations for machine learning pipelines that scale in a distributed environment. The project is currently at version 6.4.0 and releases frequently, often several times a month, with a strong focus on LLM integration, multimodal document processing, and pipeline robustness.
Warnings
- breaking Spark NLP has strict compatibility requirements with Apache Spark and Scala versions. Mismatches can lead to `ClassNotFoundException`, `NoSuchMethodError`, or other runtime errors.
- gotcha Spark NLP operations, particularly model training or processing large documents/datasets, can be memory-intensive due to its reliance on the JVM. Default Spark/JVM memory settings may be insufficient, leading to `OutOfMemoryError`.
- gotcha Choose carefully between `LightPipeline` and the full `Pipeline` for inference. `LightPipeline` runs entirely on the driver, bypassing Spark's distributed execution, and is optimized for fast single-record or small-batch inference; `Pipeline.transform` distributes work across the cluster and is intended for large-scale batch processing. Using the wrong one can create severe performance bottlenecks.
- deprecated Manually configuring SparkSession to include Spark NLP JARs using `spark.jars.packages` is largely superseded by `sparknlp.start()`, which automates this process and handles versioning and compatibility.
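The memory and configuration warnings above can be illustrated with a short configuration sketch. This assumes a local PySpark environment; the exact package coordinates in the commented-out legacy setup are placeholders, not pinned values:

```python
import sparknlp

# sparknlp.start() resolves and downloads the Spark NLP JAR matching the
# installed PySpark/Scala versions and returns a configured SparkSession.
# The `memory` argument sets the driver memory, which is the usual first
# fix for JVM OutOfMemoryError during training or large-document processing.
spark = sparknlp.start(memory="16G")

# The superseded manual setup looked roughly like this; the Scala suffix and
# version had to be kept in sync by hand, which is what sparknlp.start() avoids:
#
# from pyspark.sql import SparkSession
# spark = SparkSession.builder \
#     .appName("Spark NLP") \
#     .config("spark.driver.memory", "16g") \
#     .config("spark.jars.packages",
#             "com.johnsnowlabs.nlp:spark-nlp_2.12:<version>") \
#     .getOrCreate()
```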
Install
- pip install spark-nlp pyspark
- pip install spark-nlp[tensorflow]
- pip install spark-nlp[databricks]
Imports
- sparknlp
import sparknlp
- DocumentAssembler
from sparknlp.base import DocumentAssembler
- Tokenizer
from sparknlp.annotator import Tokenizer
- WordEmbeddingsModel
from sparknlp.annotator import WordEmbeddingsModel
- Pipeline
from pyspark.ml import Pipeline
- SparkSession
from pyspark.sql import SparkSession
- LightPipeline
from sparknlp.base import LightPipeline
Quickstart
import sparknlp
from sparknlp.base import DocumentAssembler, LightPipeline
from pyspark.ml import Pipeline
from sparknlp.annotator import Tokenizer, WordEmbeddingsModel
# 1. Initialize SparkSession with Spark NLP
# Automatically downloads the matching Spark NLP JARs and configures the session
# `memory` sets the Spark driver memory; adjust it for your environment
spark = sparknlp.start(memory="16g")
# 2. Define a simple Spark NLP pipeline
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
# Load a pre-trained Word Embeddings model
# (glove_100d is a small, general-purpose model suitable for quickstarts)
word_embeddings = WordEmbeddingsModel.pretrained("glove_100d", "en")\
.setInputCols(["document", "token"])\
.setOutputCol("embeddings")
nlp_pipeline = Pipeline(stages=[
document_assembler,
tokenizer,
word_embeddings
])
# 3. Create a Spark DataFrame and process it
data = spark.createDataFrame([["Spark NLP is a powerful library for natural language processing on Apache Spark."]]).toDF("text")
# Fit the pipeline to the data (pretrained models are downloaded on first use and cached locally)
pipeline_model = nlp_pipeline.fit(data)
result = pipeline_model.transform(data)
# 4. Show results
print("\nPipeline Result:")
# `embeddings.embeddings` holds the vectors; `embeddings.result` only echoes the token text
result.select("token.result", "embeddings.embeddings").show(truncate=80)
# Example of LightPipeline for single-record inference
light_pipeline = LightPipeline(pipeline_model)
light_result = light_pipeline.annotate("Spark NLP makes NLP scalable and easy.")
print("\nLightPipeline Result (tokens):", light_result['token'])
# Don't forget to stop the SparkSession when done
spark.stop()