{"id":9160,"library":"optimum-intel","title":"Optimum Intel","description":"Optimum Intel extends the Hugging Face Transformers and Diffusers libraries, providing a framework to integrate Intel's specialized tools and libraries like OpenVINO, Neural Compressor, and Intel Extension for PyTorch. It enables optimization, conversion (e.g., to OpenVINO IR format), and accelerated inference of deep learning models on Intel architectures. The library is actively maintained with frequent minor version releases, currently at 1.27.0.","status":"active","version":"1.27.0","language":"en","source_language":"en","source_url":"https://github.com/huggingface/optimum-intel","tags":["huggingface","intel","optimization","nlp","transformers","diffusers","onnx","openvino","pytorch","quantization","inference"],"install":[{"cmd":"pip install --upgrade-strategy eager \"optimum-intel[openvino]\"","lang":"bash","label":"Recommended with OpenVINO"},{"cmd":"pip install optimum-intel","lang":"bash","label":"Base installation"}],"dependencies":[{"reason":"Core Optimum library for hardware optimizations.","package":"optimum","optional":false},{"reason":"Hugging Face Transformers models are the primary target for optimization.","package":"transformers","optional":false},{"reason":"PyTorch backend often used for original models and post-processing.","package":"torch","optional":false},{"reason":"Required by core `optimum-intel` for ONNX capabilities.","package":"optimum-onnx","optional":false},{"reason":"Required for OpenVINO runtime and model conversion.","package":"openvino","optional":true},{"reason":"Required for Neural Network Compression Framework (NNCF) quantization features.","package":"nncf","optional":true},{"reason":"Required for IPEX optimizations.","package":"intel-extension-for-pytorch","optional":true},{"reason":"Required for optimizing and inferencing Hugging Face Diffusers models (e.g., Stable Diffusion).","package":"diffusers","optional":true}],"imports":[{"note":"Use 
`OVModelForCausalLM` for OpenVINO-optimized causal language models.","wrong":"from transformers import AutoModelForCausalLM","symbol":"OVModelForCausalLM","correct":"from optimum.intel import OVModelForCausalLM"},{"note":"Use `OVModelForSeq2SeqLM` for OpenVINO-optimized sequence-to-sequence models.","wrong":"from transformers import AutoModelForSeq2SeqLM","symbol":"OVModelForSeq2SeqLM","correct":"from optimum.intel import OVModelForSeq2SeqLM"},{"note":"Use `OVStableDiffusionPipeline` for OpenVINO-optimized Diffusers Stable Diffusion pipelines.","wrong":"from diffusers import StableDiffusionPipeline","symbol":"OVStableDiffusionPipeline","correct":"from optimum.intel import OVStableDiffusionPipeline"},{"note":"Older versions used `lpot` which was renamed to `neural_compressor`, then simplified to direct import under `optimum.intel`.","wrong":"from optimum.intel.lpot.quantization import LpotQuantizerForSequenceClassification","symbol":"INCModelForSequenceClassification","correct":"from optimum.intel import INCModelForSequenceClassification"}],"quickstart":{"code":"from transformers import AutoTokenizer, pipeline\nfrom optimum.intel import OVModelForSequenceClassification\n\nmodel_id = \"distilbert-base-uncased-finetuned-sst-2-english\"\ntokenizer = AutoTokenizer.from_pretrained(model_id)\n# Load and convert the model to OpenVINO IR format on the fly\nmodel = OVModelForSequenceClassification.from_pretrained(model_id, export=True)\n\n# Run inference\nclassifier = pipeline(\"text-classification\", model=model, tokenizer=tokenizer)\nresults = classifier(\"Optimum Intel is great!\")\nprint(results)","lang":"python","description":"This quickstart demonstrates loading a pre-trained sentiment analysis model, converting it to OpenVINO Intermediate Representation (IR) format on the fly using `export=True`, and running inference with a Hugging Face pipeline. 
Ensure `optimum-intel[openvino]` and `transformers` are installed."},"warnings":[{"fix":"For future compatibility, install `optimum` with relevant extras (e.g., `pip install optimum[openvino]`) or install the base `optimum-intel` package and then the backend libraries it needs, such as `openvino` and `nncf`.","message":"The installation extras for specific backends (e.g., `[openvino]`, `[nncf]`, `[neural-compressor]`, `[ipex]`) via `pip install optimum-intel[...]` are deprecated and will be removed in a future release. Users are encouraged to install `optimum` and its specific backend extras directly or install `optimum-intel` base and then the backend libraries separately.","severity":"deprecated","affected_versions":">=1.27.0"},{"fix":"Review your quantization configurations and migrate to supported quantization modes, such as INT8 or INT4 weight-only quantization with NNCF.","message":"The `nf4_fp8` quantization modes have been removed. Code relying on these specific quantization modes will break.","severity":"breaking","affected_versions":">=1.27.0"},{"fix":"Limit the number of CPU threads PyTorch uses with `torch.set_num_threads()` to mitigate this interaction.","message":"When using OpenVINO Runtime with PyTorch for post-processing (e.g., beam search), OpenVINO's default threading (oneTBB) can interact poorly with PyTorch's OpenMP, leading to performance degradation or delays.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Update your import statements. 
For example, change `from optimum.intel.lpot.quantization import LpotQuantizerForSequenceClassification` to `from optimum.intel import INCModelForSequenceClassification` (or a similar `INCModelForXxx` class).","cause":"The `lpot` subpackage was renamed to `neural_compressor`, and its import paths changed.","error":"ModuleNotFoundError: No module named 'optimum.intel.lpot'"},{"fix":"Ensure OpenVINO is correctly installed and its dependencies are met. Check the full traceback for more specific OpenVINO errors. Try specifying `device=\"CPU\"` or `device=\"GPU\"` explicitly. Save the model after conversion using `model.save_pretrained()` and then load it without `export=True` for debugging. Refer to OpenVINO documentation for device-specific requirements.","cause":"This often occurs during `from_pretrained(export=True)` or `model.compile()` if there are issues with the OpenVINO environment, device compatibility, or model specifics (e.g., unsupported operations on the target device, or an incomplete OpenVINO installation).","error":"RuntimeError: [ ERROR ] Failed to compile the model."},{"fix":"Investigate concurrency patterns and potential race conditions in your application. Synchronize access to shared resources, or process inference requests sequentially if parallel calls are causing instability. Monitor memory usage.","cause":"This issue has been observed with concurrent inference calls using OpenVINO, suggesting a potential race condition or resource management problem under heavy load or parallel execution.","error":"Segmentation fault (core dumped) during inference with OpenVINO."}]}