segtok: Sentence Segmentation and Word Tokenization
Segtok is a fast, rule-based Python library for sentence segmentation and word tokenization. It is designed for texts with regular orthography, particularly English, German, and Romance languages, and offers high precision and Unicode support. The current version is 1.5.11. Although still functional, it has largely been superseded by 'syntok' (segtok v2), which offers improved performance and handles more edge cases. Segtok is in maintenance mode, with no active development.
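A minimal sketch of the two steps described above, sentence segmentation followed by word tokenization. It assumes the `split_single` and `word_tokenizer` functions from segtok 1.5.x. The `segment_and_tokenize` helper and the regex fallback are hypothetical, added only so the sketch runs when the package is not installed; the fallback does not reproduce segtok's rules.

```python
import re

try:
    # Assumed to be available in segtok 1.5.x.
    from segtok.segmenter import split_single
    from segtok.tokenizer import word_tokenizer
except ImportError:
    # Naive stand-ins so the sketch runs without segtok installed.
    # These are NOT segtok's rule set, only an illustration.
    def split_single(text):
        return re.split(r"(?<=[.!?])\s+", text)

    def word_tokenizer(sentence):
        return re.findall(r"\w+|[^\w\s]", sentence)


def segment_and_tokenize(text):
    """Return one list of token strings per non-empty sentence."""
    sentences = [s for s in split_single(text) if s.strip()]
    return [list(word_tokenizer(s)) for s in sentences]


result = segment_and_tokenize("Segtok splits text. It also tokenizes words.")
for tokens in result:
    print(tokens)
```

Filtering out blank sentences keeps the output stable if the segmenter yields empty strings around line breaks. For new projects, the same pattern may be worth trying against syntok instead, since segtok is no longer actively developed.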
API endpoints
full doc /v1/registry/segtok
install /v1/registry/segtok/install
compatibility /v1/registry/segtok/compatibility