Simplemma

1.1.2 verified Sat May 09 auth: no python

A lightweight multilingual lemmatization and language detection library for Python. Current version: 1.1.2. Release cadence: irregular, with major breaking changes at v1.0.0. Requires Python >=3.8.

pip install simplemma
error AttributeError: module 'simplemma' has no attribute 'langdetect'
cause The 'langdetect' submodule was renamed to 'language_detector' in v1.0.0 and removed in v1.1.2.
fix Change the import to: 'from simplemma.language_detector import LanguageDetector'
error TypeError: lemmatize() got an unexpected keyword argument 'extensive'
cause The 'extensive' argument was renamed to 'greedy' in v1.0.0.
fix Use 'greedy=True' instead of 'extensive=True'.
error ModuleNotFoundError: No module named 'simplemma.langdetect'
cause The submodule 'langdetect' has been removed as of v1.1.2.
fix Use 'from simplemma.language_detector import LanguageDetector'.
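Code that has to run against both old and new releases can guard the import instead of failing at load time. A minimal sketch, relying only on the module and class names given above; `load_language_detector` is a hypothetical helper name, not part of simplemma:

```python
import importlib


def load_language_detector():
    """Return simplemma's LanguageDetector class, or None if unavailable."""
    try:
        # ModuleNotFoundError is a subclass of ImportError, so this also
        # covers environments where simplemma itself is not installed.
        module = importlib.import_module("simplemma.language_detector")
    except ImportError:
        return None
    return getattr(module, "LanguageDetector", None)


detector_cls = load_language_detector()
if detector_cls is None:
    print("simplemma.language_detector not found; upgrade to simplemma >= 1.0.0")
```

Returning None keeps the decision about what to do next (upgrade, skip detection, raise) with the caller.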
error ValueError: Wrong language code
cause Language codes must be lowercase and match a supported language (mostly two-letter ISO 639-1 codes). Unknown or misspelled codes cause this error.
fix Check your language code against the list of supported languages (e.g., 'en', 'de', 'fr').
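Normalizing and checking codes before they reach the library turns this into an earlier, clearer failure. A sketch under stated assumptions: `normalize_lang` is a hypothetical helper, and `SAMPLE_CODES` holds only the three codes mentioned above, not the library's full list of supported languages.

```python
# Illustrative subset only; replace with the codes your project actually uses.
SAMPLE_CODES = frozenset({"en", "de", "fr"})


def normalize_lang(code, supported=SAMPLE_CODES):
    """Trim and lowercase a language code, rejecting anything not in `supported`."""
    cleaned = code.strip().lower()
    if cleaned not in supported:
        raise ValueError(f"Unsupported language code: {code!r}")
    return cleaned


print(normalize_lang(" EN "))  # prints: en
```

Input such as 'english' or 'eng' is rejected here with a message that names the offending value.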
breaking The 'extensive' argument was renamed to 'greedy' in v1.0.0. Using 'extensive' will raise an error.
fix Replace 'extensive=True' with 'greedy=True'.
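Where many call sites still pass the old keyword, a thin translation step can ease the migration. A minimal sketch: `migrate_lemmatize_kwargs` is a hypothetical helper that only rewrites keyword names and never calls simplemma itself.

```python
import warnings


def migrate_lemmatize_kwargs(kwargs):
    """Return a copy of kwargs with the legacy 'extensive' key renamed to 'greedy'."""
    migrated = dict(kwargs)  # leave the caller's dict untouched
    if "extensive" in migrated:
        if "greedy" in migrated:
            raise TypeError("pass either 'extensive' or 'greedy', not both")
        warnings.warn("'extensive' is now 'greedy'", DeprecationWarning, stacklevel=2)
        migrated["greedy"] = migrated.pop("extensive")
    return migrated


legacy_call = {"lang": "en", "extensive": True}
print(migrate_lemmatize_kwargs(legacy_call))  # {'lang': 'en', 'greedy': True}
```

The migrated dict can then be forwarded as `lemmatize(token, **migrated)`; the warning makes remaining legacy call sites easy to find.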
breaking The 'langdetect' submodule has been renamed to 'language_detector' in v1.0.0. Old imports will fail.
fix Use 'from simplemma.language_detector import LanguageDetector'.
deprecated The 'simplemma.langdetect' submodule is removed in v1.1.2. Old code using it will break.
fix Migrate to 'simplemma.language_detector', which is available from v1.0.0 onward.
gotcha Lemmatization is dictionary-based; unknown words may be returned unchanged. Use 'greedy=True' for more aggressive (rule-based) fallback.
fix Consider enabling greedy mode or pre-checking with 'is_known()'.
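The two suggestions can be combined: look the word up first and only fall back to greedy mode when the dictionary does not know it. A sketch, assuming the known-word check is exposed as `simplemma.is_known(token, lang=...)` as in the 1.x README; `lemmatize_with_fallback` is a hypothetical wrapper, so verify the names against your installed version.

```python
def lemmatize_with_fallback(token, lang="en"):
    """Try a plain dictionary lookup first, then retry in greedy mode."""
    # Imported lazily so this module still loads where simplemma is missing.
    from simplemma import is_known, lemmatize

    if is_known(token, lang=lang):
        return lemmatize(token, lang=lang)
    # Unknown to the dictionary: allow the more aggressive search.
    return lemmatize(token, lang=lang, greedy=True)
```

Greedy mode can over-reduce words, so limiting it to unknown tokens keeps results for known words conservative.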
gotcha 'LanguageDetector' cannot be called without arguments: it must be constructed with the candidate language(s). The old 'langdetect' function from v0.x is not available.
fix Build the detector with the candidate languages and call one of its methods, e.g. 'LanguageDetector(("de", "en")).main_language(text)', instead of the removed 'simplemma.langdetect'.
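A short sketch of class-based detection, assuming the 1.x `LanguageDetector` takes the candidate languages as its first argument and exposes `main_language(text)`; `detect_main_language` and the default candidate tuple are illustrative, so check the signature against your installed version.

```python
def detect_main_language(text, candidates=("de", "en", "fr")):
    """Return the most likely language code among the candidates."""
    # Lazy import: requires simplemma >= 1.0.0 at call time, not at load time.
    from simplemma.language_detector import LanguageDetector

    detector = LanguageDetector(candidates)  # candidate languages are required
    return detector.main_language(text)
```

Keeping the candidate list short and relevant tends to make detection both faster and more reliable, since every candidate language's dictionary has to be consulted.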

Basic usage of simplemma for lemmatizing a word with optional greedy mode.

from simplemma import lemmatize

# Lemmatize a single word
result = lemmatize('running', lang='en')
print(result)  # 'run'

# With greedy mode (more aggressive rule-based lemmatization)
result_greedy = lemmatize('better', lang='en', greedy=True)
print(result_greedy)  # 'good' (exact output depends on the bundled dictionary data)