{"id":1675,"library":"python-crfsuite","title":"python-crfsuite","description":"python-crfsuite is a Python binding for CRFsuite, a fast implementation of Conditional Random Fields (CRFs) for labeling sequential data. It's widely used in Natural Language Processing (NLP) for tasks like Named Entity Recognition (NER), Part-of-Speech (POS) tagging, and other sequence labeling problems. The current version is 0.9.12, and releases primarily focus on Python version compatibility and stability.","status":"active","version":"0.9.12","language":"en","source_language":"en","source_url":"https://github.com/scrapinghub/python-crfsuite","tags":["NLP","Machine Learning","CRF","Sequence Labeling","Conditional Random Fields"],"install":[{"cmd":"pip install python-crfsuite","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"note":"The PyPI package is 'python-crfsuite', but the Python import module is 'pycrfsuite'.","wrong":"from python_crfsuite import Trainer","symbol":"Trainer","correct":"import pycrfsuite\ntrainer = pycrfsuite.Trainer(...)"},{"note":"The PyPI package is 'python-crfsuite', but the Python import module is 'pycrfsuite'.","wrong":"from python_crfsuite import Tagger","symbol":"Tagger","correct":"import pycrfsuite\ntagger = pycrfsuite.Tagger(...)"}],"quickstart":{"code":"import pycrfsuite\nimport os\n\n# Sample data: each sequence is a list of items, and each item is a list of string features\nX_train = [\n    [['walk', 'big'], ['dog']],\n    [['eat', 'apple'], ['red', 'apple']],\n    [['run', 'fast'], ['cat']]\n]\n# One label per item, aligned with X_train\ny_train = [\n    ['VERB', 'NOUN'],\n    ['VERB', 'NOUN'],\n    ['VERB', 'NOUN']\n]\n\n# 1. Train a CRF model\ntrainer = pycrfsuite.Trainer(verbose=False)\nfor xseq, yseq in zip(X_train, y_train):\n    trainer.append(xseq, yseq)\n\ntrainer.set_params({\n    'c1': 1.0,   # coefficient for L1 penalty\n    'c2': 1e-3,  # coefficient for L2 penalty\n    'max_iterations': 50,  # stop earlier\n    'feature.possible_transitions': True\n})\n\nmodel_filename = 'model.crfsuite'\ntrainer.train(model_filename)\n\nprint(f\"Model trained and saved to '{model_filename}'\")\n\n# 2. Use the trained model for tagging\ntagger = pycrfsuite.Tagger()\ntagger.open(model_filename)\n\nX_test = [\n    [['see', 'small'], ['dog']]\n]\n\npredicted_tags = [tagger.tag(xseq) for xseq in X_test]\nprint(f\"Test sequence: {X_test}\")\nprint(f\"Predicted tags: {predicted_tags}\")\n\n# Clean up the model file\nos.remove(model_filename)","lang":"python","description":"This quickstart trains a Conditional Random Field (CRF) model with `pycrfsuite.Trainer`, then loads the saved model with `pycrfsuite.Tagger` to predict labels for new sequences. The example uses a simple list-of-lists format for features and labels, which is common for sequence labeling tasks."},"warnings":[{"fix":"Upgrade your Python environment to 3.10 or newer (3.10 through 3.14 are supported). If unable to upgrade, pin to `python-crfsuite<0.9.12`.","message":"Version 0.9.12 dropped support for Python 3.6, 3.7, 3.8, and 3.9. Users on these older Python versions must either upgrade their Python environment or pin to an older `python-crfsuite` version.","severity":"breaking","affected_versions":"0.9.12 and later"},{"fix":"Always use `import pycrfsuite` in your Python code, even though you install it with `pip install python-crfsuite`.","message":"The PyPI package name is `python-crfsuite`, but the module to import in your Python code is `pycrfsuite`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure your feature sequences are formatted as `list[list[str]]` where the outer list represents the sequence, and each inner list represents the features for a single token/item in that sequence. Refer to the quickstart for an example.","message":"`Trainer.append()` and `Tagger.tag()` expect each sequence as a list containing one feature list per item. Each feature list is typically a list of strings (e.g., `[['feature1', 'feature2'], ['feature3']]`). Incorrectly formatted input will lead to errors.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}