sentencex
1.0.23 verified Fri May 01 auth: no python
A sentence segmentation library supporting ~300 languages, powered by Rust for performance. Version 1.0.23, MIT license, maintained by Wikimedia. Releases are frequent with multilingual improvements.
pip install sentencex
Common errors
error TypeError: 'generator' object is not subscriptable
cause segment() returns a generator, which cannot be indexed the way a list can.
fix
sentences = list(segment('en', text))
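This failure is ordinary Python generator behaviour, so it can be reproduced without the library installed. The sketch below uses a hypothetical stand-in, `fake_segment`, that yields sentences lazily in the way this entry describes `segment` doing. It is not sentencex's implementation, and its naive split is for illustration only.

```python
from itertools import islice


def fake_segment(language, text):
    """Hypothetical stand-in for sentencex.segment: yields sentences lazily.

    The naive split on '. ' is for illustration only.
    """
    for part in text.split(". "):
        yield part


text = "One. Two. Three"

# Indexing a generator raises the TypeError described above.
try:
    fake_segment("en", text)[0]
except TypeError as exc:
    error_message = str(exc)

# Option 1: materialize everything when random access is needed.
sentences = list(fake_segment("en", text))

# Option 2: take only the first item without consuming the rest.
first = next(fake_segment("en", text), None)

# Option 3: take the first n items lazily.
first_two = list(islice(fake_segment("en", text), 2))
```

`list()` is the simplest choice for short inputs. `next()` and `islice()` avoid segmenting more text than is needed.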
error ValueError: Unsupported language code: 'English'
cause The language argument must be an ISO 639-1 two-letter code, not a language name.
fix
Use 'en' instead of 'English'.
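One way to catch this mistake before calling the library is to normalize the input first. The helper below is a hypothetical sketch and not part of sentencex: `NAME_TO_CODE` and `normalize_language` are names invented for illustration, and the table covers only a handful of languages.

```python
# Hypothetical lookup table; extend it with the languages your application needs.
NAME_TO_CODE = {
    "english": "en",
    "german": "de",
    "french": "fr",
    "chinese": "zh",
}


def normalize_language(value):
    """Return a lowercase two-letter code for a language name or code.

    Raises ValueError for anything this small table cannot resolve.
    It does not check whether sentencex actually supports the code.
    """
    key = value.strip().lower()
    if key in NAME_TO_CODE:
        return NAME_TO_CODE[key]
    if len(key) == 2 and key.isascii() and key.isalpha():
        return key
    raise ValueError(f"Cannot map {value!r} to a two-letter language code")
```

The result can then be passed as the first argument to `segment`. Whether a given code is supported still has to be checked against the library's own list.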
Warnings
gotcha segment() returns a generator, not a list. It must be consumed (e.g., with list()) to see all sentences.
fix Wrap the call in list() or iterate over the result.
gotcha Language codes must be lowercase (e.g., 'en', 'zh'), and not every ISO code is supported. Check the supported-language list.
fix Use 'en' for English; see the README for supported codes.
Imports
- segment
wrong: import sentencex; sentencex.segment()
correct: from sentencex import segment
Quickstart
from sentencex import segment
text = "Hello world! This is a test. And another sentence."
sentences = segment('en', text)
print(list(sentences))
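Because the result is lazy, it can also be processed one sentence at a time instead of being collected into a list. The sketch below assumes only the `(language, text)` call signature used above. `clean_sentences` and `demo_segmenter` are hypothetical names; the stand-in segmenter exists only so that the example runs without sentencex installed.

```python
def clean_sentences(segmenter, language, text):
    """Yield stripped, non-empty sentences from segmenter(language, text)."""
    for sentence in segmenter(language, text):
        sentence = sentence.strip()
        if sentence:
            yield sentence


def demo_segmenter(language, text):
    """Stand-in used only so this sketch runs without sentencex installed."""
    yield from text.split("\n")


sample = "First line\n\n Second line "
for number, sentence in enumerate(clean_sentences(demo_segmenter, "en", sample), start=1):
    print(number, sentence)
```

With the library installed, pass `segment` from the Quickstart in place of `demo_segmenter`.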