{"id":5326,"library":"model2vec","title":"Model2Vec","description":"Model2Vec is a Python library for distilling and using fast static embedding models for NLP tasks such as classification, clustering, and semantic search. It distills Sentence Transformer models from the Hugging Face ecosystem into compact static embeddings for fast, efficient inference. The current version is 0.8.1, and releases typically arrive monthly or bi-monthly.","status":"active","version":"0.8.1","language":"en","source_language":"en","source_url":"https://github.com/MinishLab/model2vec","tags":["NLP","embeddings","natural-language-processing","deep-learning","Hugging Face","semantic-search"],"install":[{"cmd":"pip install model2vec","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Provides model architectures and tokenizers for distillation; installed through the `distill` extra (`pip install model2vec[distill]`) and not needed for inference alone.","package":"transformers","optional":true},{"reason":"Deep learning framework used during distillation; installed through the `distill` extra and not needed for inference alone.","package":"torch","optional":true}],"imports":[{"note":"The primary class for loading static models and encoding text.","symbol":"StaticModel","correct":"from model2vec import StaticModel"}],"quickstart":{"code":"from model2vec import StaticModel\n\n# Load a pre-trained static embedding model from the Hugging Face Hub.\nmodel = StaticModel.from_pretrained(\"minishlab/potion-base-8M\")\n\n# Get embeddings for some text\nsentences = [\n    \"This is a test sentence for model2vec.\",\n    \"Another example sentence to demonstrate embedding.\"\n]\nembeddings = model.encode(sentences)\n\n# The first dimension is the number of sentences; the second is the model's embedding size.\nprint(f\"Embeddings shape: {embeddings.shape}\")","lang":"python","description":"Load a pre-trained static model from the Hugging Face Hub with `StaticModel.from_pretrained` and encode a list of sentences into embeddings."},"warnings":[{"fix":"Refer to the GitHub changelog for v0.5.0 and the updated documentation and examples for `StaticModel` loading and usage patterns.","message":"The `v0.5.0` release included a significant 'rewrite backend' change, which likely introduced breaking changes to the API, particularly around model initialization and internal component access. Users upgrading from versions prior to 0.5.0 may need to refactor their code.","severity":"breaking","affected_versions":"<0.5.0"},{"fix":"Upgrade to `model2vec==0.8.1` or newer for improved Windows compatibility.","message":"Prior to `v0.8.1`, Windows users might encounter path-related issues when loading models or processing data due to non-POSIX path handling. This was addressed in v0.8.1.","severity":"gotcha","affected_versions":"<0.8.1"},{"fix":"Keep `transformers` up to date and compatible with your `model2vec` installation. Consult the `model2vec` `pyproject.toml` for the exact `transformers` version constraints.","message":"Model2Vec uses Hugging Face `transformers` for distillation. Mismatched or outdated versions of `transformers` can lead to issues with tokenizers, padding, or model loading. 
Version `v0.7.0` specifically included a fix for padding token recognition and an update to `transformers` usage.","severity":"gotcha","affected_versions":"<0.7.0"}],"env_vars":null,"last_verified":"2026-04-13T00:00:00.000Z","next_check":"2026-07-12T00:00:00.000Z"}