Morfessor
2.0.6 · verified Mon Apr 27 · auth: no · python
Morfessor is a tool for unsupervised morphological segmentation of words, often used in NLP and computational linguistics. The `morfessor` package implements the Morfessor Baseline method; the category-based variants (Categories-ML/MAP, FlatCat) are not part of this package. The current version is 2.0.6, and releases are infrequent.
pip install morfessor
Common errors
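error ModuleNotFoundError: No module named 'morfessor.baseline'
cause In version 2.x morfessor.baseline is a real submodule, so this error usually means Python is not importing the installed package, typically because a local file or directory named morfessor shadows it.
fix Rename the shadowing file or directory, confirm the package is installed in the active environment, and import the class from the top level: from morfessor import BaselineModel
error MorfessorBaseline not defined
cause The package does not define a class named MorfessorBaseline. The baseline model class in version 2.x is BaselineModel.
fix Use BaselineModel instead of MorfessorBaseline.
Warnings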
breaking Python 3.6+ required starting from version 2.0.0. Python 2 is no longer supported. ↓
fix Upgrade Python to 3.6 or later. Use pip install 'morfessor<2.0.0' if legacy Python is needed.
breaking The 2.x Python API exposes the model as `BaselineModel`, with `MorfessorIO` for reading and writing corpora and models. The package has no `Morfessor` or `MorfessorBaseline` class, so code written against those names will fail.
fix Use `BaselineModel` in imports and instantiation.
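gotcha Pretrained models are not included. A `BaselineModel` must be trained on data (or loaded from a saved model file) before it can produce meaningful segmentations.
fix Load training data with `model.load_data(...)`, which takes an iterable of (count, word) pairs such as the output of `MorfessorIO.read_corpus_file('corpus.txt')` rather than a file name, then call `model.train_batch()`. Use `viterbi_segment()` for words that were not in the training data; `segment()` only covers words the model has already seen.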
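The training step described above can be sketched in memory, without a corpus file. This is a minimal sketch, assuming Morfessor 2.x exposes `BaselineModel`, `load_data`, `train_batch`, and `viterbi_segment` as used here; the helper name `train_and_segment` and the toy word list are hypothetical, and a corpus this small will not yield linguistically meaningful morphs.

```python
import importlib.util


def train_and_segment(corpus_words, word):
    """Train a small baseline model in memory and segment one word."""
    # Imported lazily so this helper can be defined even if the package is absent.
    import morfessor

    model = morfessor.BaselineModel()
    # load_data expects (count, word) pairs; every word gets count 1 here.
    model.load_data([(1, w) for w in corpus_words])
    model.train_batch()
    # viterbi_segment returns (segments, cost) and also accepts unseen words.
    segments, _cost = model.viterbi_segment(word)
    return segments


# Hypothetical toy corpus: it only exercises the calls, nothing more.
TOY_CORPUS = ["happy", "unhappy", "happiness", "unhappiness",
              "kind", "unkind", "kindness"]

if importlib.util.find_spec("morfessor") is not None:
    print(train_and_segment(TOY_CORPUS, "unhappiness"))
else:
    print("morfessor is not installed; run: pip install morfessor")
```

Whatever split the model chooses, the returned segments should concatenate back to the input word, which is a cheap sanity check on any segmentation.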
Imports
- Morfessor
  wrong: from morfessor import Morfessor
  correct: import morfessor
- BaselineModel
  also works: from morfessor.baseline import BaselineModel
  preferred: from morfessor import BaselineModel
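When it is unclear which class names a given installation exposes, it may be simpler to inspect the installed package than to rely on a list. The following is a small sketch using only the standard library; the helper `exported_names` and the `CANDIDATE_NAMES` list are hypothetical names introduced for illustration.

```python
import importlib
import importlib.util

# Names discussed on this page; which ones exist depends on the installed version.
CANDIDATE_NAMES = ["BaselineModel", "MorfessorIO", "Morfessor", "MorfessorBaseline"]


def exported_names(module_name, candidates):
    """Map each candidate name to True/False, or return None if the module is missing."""
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return {name: hasattr(module, name) for name in candidates}


report = exported_names("morfessor", CANDIDATE_NAMES)
if report is None:
    print("morfessor is not installed in this environment")
else:
    for name, present in report.items():
        print(f"morfessor.{name}: {'available' if present else 'missing'}")
```

Any name reported as missing should not be imported from the top-level package in that environment.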
Quickstart
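import morfessor
# Train a baseline model on corpus.txt (a plain-text word list)
io = morfessor.MorfessorIO()
model = morfessor.BaselineModel()
model.load_data(io.read_corpus_file('corpus.txt'))
model.train_batch()

# Segment a word; viterbi_segment returns (segments, cost)
segments, cost = model.viterbi_segment('unhappiness')
print(segments)  # depends on the training data, e.g. ['un', 'happiness']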
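Because no pretrained models ship with the package, a trained model is usually saved once and reloaded later. The sketch below assumes `MorfessorIO.write_binary_model_file(path, model)` and `MorfessorIO.read_binary_model_file(path)` behave as in Morfessor 2.x; the helper `save_and_reload`, the file name `model.bin`, and the word list are hypothetical.

```python
import importlib.util
import os
import tempfile


def save_and_reload(model, path):
    """Pickle a trained model to `path` and read it back."""
    import morfessor  # lazy import: only needed when the helper is called

    io = morfessor.MorfessorIO()
    io.write_binary_model_file(path, model)
    return io.read_binary_model_file(path)


# Hypothetical toy data, only there to exercise the calls.
WORDS = ["kind", "unkind", "kindness", "happy", "unhappy", "happiness"]

if importlib.util.find_spec("morfessor") is not None:
    import morfessor

    model = morfessor.BaselineModel()
    model.load_data([(1, w) for w in WORDS])
    model.train_batch()
    with tempfile.TemporaryDirectory() as tmp:
        reloaded = save_and_reload(model, os.path.join(tmp, "model.bin"))
    # The reloaded model should segment exactly like the original.
    print(reloaded.viterbi_segment("unkindness")[0])
else:
    print("morfessor is not installed; run: pip install morfessor")
```

The binary file is a Python pickle, so it should only be loaded from sources you trust.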