ceja

library 0.4.0 · en · maintenance

✓ verified May 27, 2026

ceja is a Python library that provides PySpark implementations of string and phonetic matching algorithms. It lets users apply functions such as NYSIIS, Metaphone, Jaro-Winkler similarity, and Damerau-Levenshtein distance directly to PySpark DataFrame columns, so the matching runs on Spark's distributed engine and scales to large datasets. The current version is 0.4.0, last released in February 2023, which suggests a slow release cadence.
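A minimal sketch of what a column-level call could look like, together with a plain-Python reference for one of the metrics. The name `ceja.damerau_levenshtein_distance` and its two-column signature are assumed from the project's README and should be checked against the installed version. The helper names `osa_distance` and `with_distance`, and the column names `word1` and `word2`, are hypothetical. The reference function implements the optimal string alignment variant of Damerau-Levenshtein, which is not necessarily the variant ceja uses.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (a restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions, and transpositions of
    adjacent characters. Illustrative only; not ceja's own code.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ab" -> "ba"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def with_distance(df, left="word1", right="word2"):
    """Add a distance column to a PySpark DataFrame using ceja.

    Assumption: ceja exposes damerau_levenshtein_distance(col, col).
    Imports are local so the reference function above works without Spark.
    """
    import ceja
    from pyspark.sql.functions import col

    return df.withColumn(
        "dl_distance",
        ceja.damerau_levenshtein_distance(col(left), col(right)),
    )
```

With a running Spark session, `with_distance(df)` would add a `dl_distance` column to a DataFrame that has two string columns. The plain-Python function is useful for spot-checking a few rows locally before running the job on a cluster.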

indexed Mon Apr 20 · updated Thu Jun 04

API endpoints

full doc /v1/registry/ceja

install /v1/registry/ceja/install

compatibility /v1/registry/ceja/compatibility