sentencex

1.0.23 verified Fri May 01 auth: no python

A sentence segmentation library supporting roughly 300 languages, powered by Rust for performance. Version 1.0.23, MIT license, maintained by Wikimedia. Releases are frequent and typically bring multilingual improvements.

pip install sentencex
error TypeError: 'generator' object is not subscriptable
cause segment() returns a generator, which cannot be indexed like a list.
fix
sentences = list(segment('en', text))
error ValueError: Unsupported language code: 'English'
cause The first argument must be a language code (for example the ISO 639-1 two-letter code 'en'), not a language name.
fix
Use 'en' instead of 'English'.
gotcha The segment function returns a generator, not a list. It must be consumed (e.g., with list()) before all sentences are visible.
fix Wrap with list() or iterate.
gotcha The language code must be lowercase (e.g., 'en', 'zh'), and not every ISO code is supported. Check the list of supported languages.
fix Use 'en' for English; see README for supported codes.
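
The language-code entries above can be guarded against before calling the library. The following is a minimal sketch, not part of sentencex: normalize_language_code is a hypothetical helper, and the rule that a primary code is two or three ASCII letters is an assumption based on ISO 639, not on the library's own validation. Whether the library accepts a given code still has to be checked against its supported list.

```python
def normalize_language_code(code: str) -> str:
    """Hypothetical helper: tidy a language code before passing it on.

    Lowercases the code, trims whitespace, and turns '_' into '-'.
    Rejects values whose primary part is not 2-3 ASCII letters, which
    catches language names such as 'English'. This is a heuristic,
    not the library's own check.
    """
    cleaned = code.strip().lower().replace("_", "-")
    primary = cleaned.split("-")[0]
    looks_like_code = (
        primary.isascii() and primary.isalpha() and 2 <= len(primary) <= 3
    )
    if not looks_like_code:
        raise ValueError(f"Not a plausible language code: {code!r}")
    return cleaned


print(normalize_language_code(" EN "))     # en
print(normalize_language_code("zh_Hant"))  # zh-hant
```

The cleaned value can then be passed as the first argument to segment.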

Basic usage: segment English text into sentences.

from sentencex import segment

text = "Hello world! This is a test. And another sentence."
sentences = segment('en', text)
print(list(sentences))
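
The list() call in the example above matters because of how Python generators behave. The sketch below illustrates this with a stand-in generator, fake_segment, which is hypothetical and only mimics the lazy return type described in this entry; it does a naive split and is not how sentencex segments text.

```python
from typing import Iterator


def fake_segment(language: str, text: str) -> Iterator[str]:
    """Stand-in for a lazy segmenter (NOT sentencex): naive split on '. '."""
    for part in text.split(". "):
        yield part


sentences = fake_segment("en", "One. Two. Three")

# Indexing a generator raises the TypeError listed in the errors above.
error_message = ""
try:
    sentences[0]
except TypeError as exc:
    error_message = str(exc)

# Materializing once gives a list that can be indexed and reused.
sentence_list = list(sentences)
first = sentence_list[0]

# The generator is now exhausted, so a second pass yields nothing.
leftover = list(sentences)

print(error_message)
print(sentence_list, first, leftover)
```

If the sentences are needed more than once, keep the list rather than the generator.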