ceja

library · 0.4.0 · en · maintenance
verified May 27, 2026

ceja is a Python library that provides PySpark implementations of string and phonetic matching algorithms. It lets users apply functions such as NYSIIS, Metaphone, Jaro-Winkler similarity, and Damerau-Levenshtein distance directly to PySpark DataFrame columns, so the matching runs on Spark's distributed engine and scales to large datasets. The current version is 0.4.0, last released in February 2023, which points to a slow release cadence.
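
To make one of the listed metrics concrete, here is a minimal pure-Python sketch of an edit distance that counts insertions, deletions, substitutions, and adjacent transpositions. It is not ceja's code: the function name `osa_distance` is hypothetical, and the sketch implements the restricted "optimal string alignment" variant of Damerau-Levenshtein, which can differ from the unrestricted variant on some inputs.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Illustrative only; `osa_distance` is a hypothetical name, not a ceja function.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ca" -> "ac".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]


if __name__ == "__main__":
    for left, right in [("ca", "ac"), ("kitten", "sitting"), ("night", "nigth")]:
        print(left, right, osa_distance(left, right))
```

A library like ceja exposes this kind of metric as a column-level function, so the comparison happens per row inside a DataFrame rather than in a driver-side loop. The exact function names and signatures should be taken from ceja's own README rather than from this sketch.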
