Optimum Quanto
Optimum Quanto is a PyTorch quantization backend for Hugging Face Optimum, enabling efficient training and inference of large language models (LLMs) and other neural networks with reduced precision (e.g., 8-bit integers or 8-bit floats). It focuses on model optimization for hardware acceleration by integrating with PyTorch's native quantization functionalities. The current version is 0.2.7. As a rapidly evolving library deeply integrated with the Hugging Face ecosystem and PyTorch's quantization efforts, its release cadence is generally frequent, often tied to major Optimum or PyTorch updates.
Common errors
-
ImportError: cannot import name 'quantize' from 'optimum.quanto' (...)
cause: `optimum-quanto` is not installed, or the `optimum` package itself is too old and does not include the `quanto` subpackage.
fix: Ensure `optimum-quanto` is installed (`pip install optimum-quanto`) and that your `optimum` package is updated to a compatible version (`pip install --upgrade optimum`).
-
RuntimeError: The installed version of PyTorch (X.Y.Z) is not supported by optimum-quanto. Please install a compatible version (e.g., >=2.0.0, !=2.2.0, !=2.2.1, !=2.2.2).
cause: Attempting to use `optimum-quanto` with a PyTorch version that is explicitly blacklisted due to known compatibility issues.
fix: Uninstall the problematic PyTorch version and install a compatible one. For example: `pip uninstall torch torchvision torchaudio` followed by `pip install torch==2.1.2 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118` (adjust the CUDA version if necessary) or `pip install torch==2.3.0`.
-
ValueError: 'float8_e4m3fn' is not a valid QuantizationType for device 'cpu'
cause: Attempting to use a hardware-dependent quantization type (like `float8`) on a device that does not support it (e.g., CPU), or without the necessary software/driver configuration (e.g., a specific CUDA compute capability).
fix: Either run on compatible GPU hardware with an appropriate driver and PyTorch build, or switch to a more broadly supported quantization type for your device, such as `qint8`.
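The PyTorch version constraint from the error above can be checked up front, before importing the library. A minimal sketch using only the standard library; the known-bad version list is copied from the error message, not queried from any quanto API:

```python
from importlib import metadata

# torch versions known to be incompatible with optimum-quanto
# (taken from the error message above).
BAD_TORCH_VERSIONS = {"2.2.0", "2.2.1", "2.2.2"}

def torch_version_ok(version: str) -> bool:
    """True if the version is >= 2.0.0 and not on the known-bad list.
    Simple string check; assumes the usual X.Y.Z[+local] form."""
    base = version.split("+")[0]  # strip local tags like "+cu118"
    if base in BAD_TORCH_VERSIONS:
        return False
    return int(base.split(".")[0]) >= 2

if __name__ == "__main__":
    try:
        installed = metadata.version("torch")
        print("torch", installed, "ok:", torch_version_ok(installed))
    except metadata.PackageNotFoundError:
        print("torch is not installed")
```

Running this in your environment before upgrading or downgrading tells you whether the installed torch will trip the runtime check.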
Warnings
- breaking Optimum Quanto has strict compatibility requirements with PyTorch versions. Specifically, `torch` versions `2.2.0`, `2.2.1`, and `2.2.2` are explicitly known to be incompatible and will cause runtime errors or incorrect behavior.
- gotcha Quantization, especially to lower precision types like int8 or float8, inherently involves a trade-off where model accuracy or performance on downstream tasks might degrade. This is an expected consequence of reducing the model's precision.
- gotcha The primary performance benefits of `optimum-quanto` (e.g., speedups from INT8 or FP8) are often contingent on specific hardware acceleration (e.g., NVIDIA GPUs with Tensor Cores or specific CPU instruction sets). Running quantized models on incompatible hardware might not yield expected speedups or could even be slower than the float32 baseline.
- gotcha After applying `quantize()` to a model, it is crucial to also call `freeze()` on the model, especially when preparing it for inference. Forgetting to freeze can prevent essential backend optimizations from taking effect, leading to suboptimal performance or incorrect behavior in some scenarios.
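The accuracy trade-off mentioned above comes from rounding values onto a coarse grid. A toy, pure-Python sketch of a symmetric int8 round-trip (not quanto's actual kernels) makes the error bound concrete:

```python
# Toy symmetric int8 round-trip, pure Python (NOT quanto's actual kernels),
# illustrating why reduced precision costs accuracy.
def quantize_int8(values):
    """Map floats onto 256 integer levels; assumes at least one nonzero value."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    return [x * scale for x in q]

weights = [0.1, -0.5, 1.2, 3.14159]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each value is recovered only up to half a quantization step (scale / 2).
for original, back in zip(weights, restored):
    print(f"{original:+.5f} -> {back:+.5f} (error {abs(original - back):.5f})")
```

The per-tensor scale is set by the largest magnitude, so outlier values widen the grid and increase the rounding error on all the smaller values, which is one reason accuracy can degrade after quantization.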
Install
-
pip install optimum-quanto
Imports
- quantize
from optimum.quanto import quantize
- freeze
from optimum.quanto import freeze
- qtypes (e.g. qint8, qfloat8)
from optimum.quanto import qint8, qfloat8
Quickstart
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.quanto import quantize, freeze, qint8
# Load a pre-trained model (using a small one for quick execution)
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Run on a GPU if available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to(device)
# Apply quantization to the model weights (8-bit integer)
# This converts the linear weights to quantized tensors in place
quantize(model, weights=qint8)
# Freeze the quantized model for efficient inference
# This makes weights immutable and enables further backend optimizations
freeze(model)
print(f"Model quantized to qint8 and frozen on {device}.")
# Example inference with the quantized model
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_k=50, top_p=0.95)
print("Generated text:")
print(tokenizer.decode(outputs[0].cpu(), skip_special_tokens=True))