{"id":14789,"library":"openvino-tokenizers","title":"OpenVINO Tokenizers","description":"OpenVINO Tokenizers provides utilities to convert pre-trained tokenizers, primarily from the Hugging Face `transformers` library, into OpenVINO models. These converted tokenizers can then be compiled and run efficiently on various hardware using the OpenVINO runtime, preparing text inputs for OpenVINO-optimized Large Language Models (LLMs). The current version is 2026.1.0.0, and releases typically align with major OpenVINO toolkit releases, often on a quarterly or yearly cadence.","status":"active","version":"2026.1.0.0","language":"en","source_language":"en","source_url":"https://github.com/openvinotoolkit/openvino_tokenizers","tags":["OpenVINO","NLP","Tokenizers","AI","ML","Hugging Face","Transformers","Inference"],"install":[{"cmd":"pip install openvino-tokenizers","lang":"bash","label":"Install core library"},{"cmd":"pip install transformers","lang":"bash","label":"Install Hugging Face Transformers (recommended for tokenizer source)"}],"dependencies":[{"reason":"Commonly used to load pre-trained tokenizers which are then converted by openvino-tokenizers. Not a direct runtime dependency of openvino-tokenizers, but required for most typical workflows.","package":"transformers","optional":true},{"reason":"Core OpenVINO library is required for compiling and running the converted tokenizer models.","package":"openvino","optional":false}],"imports":[{"symbol":"convert_tokenizer","correct":"from openvino_tokenizers import convert_tokenizer"}],"quickstart":{"code":"from transformers import AutoTokenizer\nfrom openvino_tokenizers import convert_tokenizer\nimport openvino as ov\n\n# 1. Load a Hugging Face tokenizer (requires `pip install transformers`)\nhf_tokenizer = AutoTokenizer.from_pretrained(\"gpt2\")\n\n# 2. 
Convert the Hugging Face tokenizer to an OpenVINO model\n# With the default arguments the output is a single ov.Model object\nov_tokenizer_model = convert_tokenizer(hf_tokenizer)\n\n# 3. Print information about the converted OpenVINO model\nprint(f\"OpenVINO Tokenizer Model Name: {ov_tokenizer_model.get_friendly_name()}\")\nprint(f\"Number of inputs: {len(ov_tokenizer_model.inputs)}\")\nprint(f\"Number of outputs: {len(ov_tokenizer_model.outputs)}\")\n\n# The `ov_tokenizer_model` can now be compiled and used with `openvino.Core()`","lang":"python","description":"This quickstart demonstrates how to load a standard Hugging Face tokenizer and convert it into an OpenVINO model. The resulting `ov.Model` object can then be compiled and run by the OpenVINO runtime. Note that `transformers` is a common prerequisite for obtaining the initial tokenizer object."},"warnings":[{"fix":"Ensure the tokenizer object passed to `convert_tokenizer` is a standard Hugging Face tokenizer instance (e.g., from `AutoTokenizer.from_pretrained`). If using a custom tokenizer, verify it implements all necessary attributes and methods expected by OpenVINO Tokenizers.","message":"The `convert_tokenizer` function is primarily designed for tokenizers originating from the Hugging Face `transformers` library. Using custom tokenizer implementations or those from other NLP libraries may lead to conversion errors or unexpected behavior.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always install `openvino-tokenizers` and `openvino` packages from the same release line, matching both the year and the minor version (e.g., both 2024.2.x or both 2026.1.x), to ensure full compatibility.","message":"The `openvino-tokenizers` package version is typically aligned with the OpenVINO toolkit release it supports. 
Mismatching versions between `openvino-tokenizers` and the `openvino` runtime package can lead to compatibility issues, especially when dealing with advanced OpenVINO features or specific hardware support.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Consult the release notes for both `openvino-tokenizers` and the main `openvino` toolkit when upgrading across major versions. Re-convert tokenizers with the new `openvino-tokenizers` version if you upgrade your `openvino` runtime.","message":"Changes in OpenVINO's core operations or model representation between major OpenVINO toolkit versions (e.g., 2024.x to 2026.x) can lead to converted tokenizer models being incompatible with older or newer OpenVINO runtimes, or requiring code adjustments.","severity":"breaking","affected_versions":"Across major OpenVINO versions (e.g., 2024.x -> 2026.x)"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install the Hugging Face `transformers` library: `pip install transformers`.","cause":"The `transformers` library is not installed, but `AutoTokenizer` or other `transformers` components are being imported or used.","error":"ModuleNotFoundError: No module named 'transformers'"},{"fix":"Ensure you are using a standard Hugging Face tokenizer (e.g., from `AutoTokenizer.from_pretrained`). 
If using a custom tokenizer, verify it adheres to the interface expected by `openvino-tokenizers`, possibly by inspecting the source code for expected tokenizer properties.","cause":"The tokenizer object passed to `convert_tokenizer` is not a recognized type or lacks attributes/methods that OpenVINO Tokenizers expects for conversion (e.g., `vocab_size`, `ids_to_tokens`).","error":"RuntimeError: unsupported tokenizer type or missing attributes"},{"fix":"Verify that the input to `convert_tokenizer` is a tokenizer object, and inputs to the compiled OpenVINO tokenizer model are in the expected format, typically a `str` or `List[str]`.","cause":"The `convert_tokenizer` function or the OpenVINO tokenizer model input expects a single string or a list of strings, but received an incorrect input format.","error":"TypeError: 'list' object cannot be interpreted as an integer"}],"ecosystem":"pypi"}