OpenVINO Tokenizers
OpenVINO Tokenizers provides utilities to convert pre-trained tokenizers, primarily from the Hugging Face `transformers` library, into OpenVINO models. These converted tokenizers can then be compiled and run efficiently on various hardware using the OpenVINO runtime, preparing text inputs for OpenVINO-optimized Large Language Models (LLMs). The current version is 2026.1.0.0, and releases typically align with major OpenVINO toolkit releases, often on a quarterly or yearly cadence.
Common errors
- ModuleNotFoundError: No module named 'transformers'
  cause The `transformers` library is not installed, but `AutoTokenizer` or another `transformers` component is being imported or used.
  fix Install the Hugging Face `transformers` library: `pip install transformers`.
- RuntimeError: unsupported tokenizer type or missing attributes
  cause The tokenizer object passed to `convert_tokenizer` is not a recognized type, or it lacks attributes or methods that OpenVINO Tokenizers expects for conversion (e.g., `vocab_size`, `ids_to_tokens`).
  fix Use a standard Hugging Face tokenizer (e.g., one loaded with `AutoTokenizer.from_pretrained`). For a custom tokenizer, verify that it follows the interface expected by `openvino-tokenizers`; inspecting the package source shows which tokenizer properties are required.
- TypeError: 'list' object cannot be interpreted as an integer
  cause An argument has the wrong type. `convert_tokenizer` expects a tokenizer object, and the compiled OpenVINO tokenizer model expects a single string or a list of strings.
  fix Pass a tokenizer object to `convert_tokenizer`, and pass a `str` or `List[str]` to the compiled OpenVINO tokenizer model.
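Two of the errors above can be caught before any conversion is attempted. The following is a minimal sketch using only the standard library; the helper names `missing_packages` and `as_text_batch` are hypothetical and are not part of `openvino-tokenizers`.

```python
from importlib.util import find_spec
from typing import List, Sequence, Union


def missing_packages(names: Sequence[str]) -> List[str]:
    """Return the top-level module names in `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


def as_text_batch(texts: Union[str, Sequence[str]]) -> List[str]:
    """Normalize tokenizer input to a list of strings.

    A single string becomes a one-element list; any non-string item
    raises TypeError early, with a message that names the bad type.
    """
    if isinstance(texts, str):
        return [texts]
    batch = list(texts)
    for item in batch:
        if not isinstance(item, str):
            raise TypeError(f"expected str items, got {type(item).__name__}")
    return batch


if __name__ == "__main__":
    # Module names (not pip names) that the Quickstart imports.
    absent = missing_packages(["openvino", "openvino_tokenizers", "transformers"])
    if absent:
        print("Install these before converting:", ", ".join(absent))
    print(as_text_batch("Hello, OpenVINO!"))
```

Running the normalization step just before calling the compiled tokenizer keeps type mistakes out of the runtime, where the resulting message is usually harder to read.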
Warnings
- gotcha The `convert_tokenizer` function is primarily designed for tokenizers originating from the Hugging Face `transformers` library. Using custom tokenizer implementations or those from other NLP libraries may lead to conversion errors or unexpected behavior.
- gotcha The `openvino-tokenizers` package version is typically aligned with the major OpenVINO toolkit version it supports. Mismatching versions between `openvino-tokenizers` and the `openvino` runtime package can lead to compatibility issues, especially when dealing with advanced OpenVINO features or specific hardware support.
- breaking Changes in OpenVINO's core operations or model representation between major OpenVINO toolkit versions (e.g., 2024.x to 2026.x) can lead to converted tokenizer models being incompatible with older or newer OpenVINO runtimes, or requiring code adjustments.
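The version-alignment warnings above can be turned into a quick check. This is a sketch under a stated assumption: it treats two packages as aligned when the first two version components (year and release number) match, which is a heuristic and not an official compatibility rule. The function names are hypothetical.

```python
from importlib import metadata
from typing import Tuple


def release_prefix(version: str, parts: int = 2) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '2026.1.0.0' -> (2026, 1)."""
    return tuple(int(piece) for piece in version.split(".")[:parts])


def versions_aligned(runtime_version: str, tokenizers_version: str) -> bool:
    """Heuristic: the packages match if year and release number are equal."""
    return release_prefix(runtime_version) == release_prefix(tokenizers_version)


def check_installed() -> None:
    """Compare the installed `openvino` and `openvino-tokenizers` versions."""
    try:
        runtime = metadata.version("openvino")
        tokenizers = metadata.version("openvino-tokenizers")
    except metadata.PackageNotFoundError as exc:
        print(f"Cannot compare versions: {exc}")
        return
    status = "aligned" if versions_aligned(runtime, tokenizers) else "MISMATCHED"
    print(f"openvino {runtime} / openvino-tokenizers {tokenizers}: {status}")


if __name__ == "__main__":
    check_installed()
```

A mismatch reported here does not prove that a converted tokenizer will fail, but it is a reasonable first thing to rule out when a model that converted cleanly will not load or compile.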
Install
- pip install openvino-tokenizers
- pip install transformers
Imports
- convert_tokenizer
from openvino_tokenizers import convert_tokenizer
Quickstart
from transformers import AutoTokenizer
from openvino_tokenizers import convert_tokenizer
import openvino as ov
# 1. Load a Hugging Face tokenizer (requires `pip install transformers`)
hf_tokenizer = AutoTokenizer.from_pretrained("gpt2")
# 2. Convert the Hugging Face tokenizer to an OpenVINO model
# The output is an ov.Model object
ov_tokenizer_model = convert_tokenizer(hf_tokenizer)
# Give the converted model a readable name
ov_tokenizer_model.set_friendly_name("gpt2_ov_tokenizer")
# 3. Print information about the converted OpenVINO model
print(f"OpenVINO Tokenizer Model Name: {ov_tokenizer_model.get_friendly_name()}")
print(f"Number of inputs: {len(ov_tokenizer_model.inputs)}")
print(f"Number of outputs: {len(ov_tokenizer_model.outputs)}")
# The `ov_tokenizer_model` can now be compiled and used with `openvino.Core()`
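The compile-and-run step mentioned in the last comment might look like the sketch below. It assumes `openvino`, `openvino-tokenizers`, and `transformers` are installed and that the `gpt2` tokenizer can be downloaded; the wrapper function `tokenize_with_openvino` and its parameters are hypothetical names chosen for illustration.

```python
from typing import Dict, List


def tokenize_with_openvino(
    texts: List[str], model_id: str = "gpt2", device: str = "CPU"
) -> Dict[str, object]:
    """Convert a Hugging Face tokenizer, compile it, and tokenize `texts`."""
    # Heavy dependencies are imported inside the function so that merely
    # defining it does not require them to be installed.
    import openvino as ov
    from openvino_tokenizers import convert_tokenizer
    from transformers import AutoTokenizer

    # Load and convert, as in the Quickstart.
    hf_tokenizer = AutoTokenizer.from_pretrained(model_id)
    ov_tokenizer_model = convert_tokenizer(hf_tokenizer)

    # Compile the tokenizer model for the requested device.
    core = ov.Core()
    compiled_tokenizer = core.compile_model(ov_tokenizer_model, device)

    # The compiled model accepts a list of strings directly.
    results = compiled_tokenizer(texts)

    # Key each output tensor by its name for easier inspection.
    return {
        output.get_any_name(): results[output]
        for output in compiled_tokenizer.outputs
    }
```

Calling `tokenize_with_openvino(["Hello, OpenVINO!"])` should return a dictionary of NumPy arrays keyed by output name; the exact set of outputs depends on the tokenizer that was converted.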