{"id":3726,
"library":"optimum",
"title":"Hugging Face Optimum",
"description":"Optimum is an extension of the Hugging Face Transformers library that provides a framework for integrating third-party libraries from hardware partners and interfacing with their specific functionality. It focuses on optimizing models for various accelerators and runtimes, enabling faster training and inference for Transformer-based models. The library is actively developed; the current version is 2.1.0.",
"status":"active",
"version":"2.1.0",
"language":"en",
"source_language":"en",
"source_url":"https://github.com/huggingface/optimum",
"tags":["huggingface","mlops","onnx","optimization","inference","transformers","ai-accelerators"],
"install":[
{"cmd":"pip install optimum","lang":"bash","label":"Base installation"},
{"cmd":"pip install \"optimum[onnxruntime]\"","lang":"bash","label":"With ONNX Runtime (v1.x behavior, or specific integrations)"},
{"cmd":"pip install optimum-onnx","lang":"bash","label":"ONNX functionality (v2.x onwards; required for ONNX Runtime)"}],
"dependencies":[
{"reason":"Core dependency; Optimum extends and optimizes Hugging Face Transformers models.","package":"transformers","optional":false},
{"reason":"Contains the ONNX export and ONNX Runtime inference integrations, split out of the main `optimum` package in v2.0.0.","package":"optimum-onnx","optional":true},
{"reason":"Runtime for executing ONNX models; typically installed via an extra such as `optimum[onnxruntime]` or `optimum-onnx[onnxruntime]`.","package":"onnxruntime","optional":true}],
"imports":[
{"symbol":"ORTModelForCausalLM","correct":"from optimum.onnxruntime import ORTModelForCausalLM"},
{"symbol":"ORTModelForSequenceClassification","correct":"from optimum.onnxruntime import ORTModelForSequenceClassification"},
{"note":"For ONNX Runtime optimized pipelines, use `optimum.onnxruntime.pipeline` instead of `transformers.pipeline` for direct compatibility with `ORTModel` classes.","wrong":"from transformers import pipeline","symbol":"pipeline","correct":"from optimum.onnxruntime import pipeline"}],
"quickstart":{"code":"from transformers import AutoTokenizer\nfrom optimum.onnxruntime import ORTModelForCausalLM, pipeline\n\n# Load an already optimized ONNX Runtime model from the Hugging Face Hub\nmodel_id = \"optimum/gpt2\"\ntokenizer = AutoTokenizer.from_pretrained(model_id)\nmodel = ORTModelForCausalLM.from_pretrained(model_id)\n\n# Create a pipeline backed by the ONNX Runtime model\ntext_generator = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\n\n# Generate text\nprompt = \"My name is Philipp\"\nresult = text_generator(prompt, max_new_tokens=10)\nprint(result)","lang":"python","description":"This quickstart loads an existing ONNX Runtime optimized model from the Hugging Face Hub and uses it in a Hugging Face pipeline for accelerated text generation. It combines `ORTModelForCausalLM` with `optimum.onnxruntime.pipeline` for direct integration and inference."},
"warnings":[
{"fix":"Install `optimum-onnx` alongside `optimum`. For ONNX Runtime support, use `pip install \"optimum-onnx[onnxruntime]\"`.","message":"Breaking change in v2.0.0: the ONNX integration (export and ONNX Runtime inference) was moved to a separate package, `optimum-onnx`. Users upgrading from v1.x must install `optimum-onnx` (e.g., `pip install \"optimum-onnx[onnxruntime]\"`) to retain ONNX functionality.","severity":"breaking","affected_versions":">=2.0.0"},
{"fix":"Replace imports and usage of `AutoGPTQ` with `GPTQModel`.","message":"`AutoGPTQ` functionality has been fully deprecated in v2.1.0 in favor of `GPTQModel`. Users should migrate to `GPTQModel` for quantization-aware training and inference with GPTQ.","severity":"deprecated","affected_versions":">=2.1.0"},
{"fix":"Consult the `optimum` documentation for alternative approaches or specific sub-libraries if these functionalities are required.","message":"In v2.0.0, support for TF Lite, BetterTransformer, and ONNX Runtime Training was deprecated and subsequently removed or moved to other specialized packages. TensorFlow model export was also removed.","severity":"deprecated","affected_versions":">=2.0.0"},
{"fix":"Review model loading calls; `export=True` is often no longer needed, as the export can be inferred or handled by `optimum-cli`.","message":"From v1.25.0 onwards, the `export=True` argument to `ORTModelForCausalLM.from_pretrained` (and similar `ORTModel` classes) became optional and is often inferred. Passing it explicitly is frequently unnecessary and may behave unexpectedly in newer versions.","severity":"gotcha","affected_versions":">=1.25.0"},
{"fix":"Upgrade `optimum` to the latest version to ensure compatibility with recent `transformers` releases, and check the release notes for specific version requirements.","message":"Compatibility issues can arise between specific `optimum` versions and newer `transformers` versions, particularly around internal module imports such as `TF2_WEIGHTS_NAME` (fixed in v2.1.0). Ensure your `optimum` and `transformers` installations are compatible.","severity":"gotcha","affected_versions":"<2.1.0 with newer Transformers"}],
"env_vars":null,
"last_verified":"2026-04-11T00:00:00.000Z",
"next_check":"2026-07-10T00:00:00.000Z"}