Optimum Quanto
Optimum Quanto is a PyTorch quantization backend for Hugging Face Optimum. It enables efficient training and inference of large language models (LLMs) and other neural networks at reduced precision (e.g., 8-bit integers or 8-bit floats). It focuses on model optimization for hardware acceleration by integrating with PyTorch's native quantization functionality. The current version is 0.2.7. The library is evolving rapidly and is deeply integrated with the Hugging Face ecosystem and PyTorch's quantization efforts, so releases are generally frequent and often tied to major Optimum or PyTorch updates.
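To make "reduced precision" concrete, the following is a minimal sketch of symmetric 8-bit integer quantization in plain Python. It illustrates the general idea only and is not Optimum Quanto's implementation or API. The helper names `quantize_int8` and `dequantize_int8` are hypothetical, and the single per-tensor scale is an assumption chosen for brevity.

```python
# Illustrative only: symmetric per-tensor int8 quantization in plain Python.
# This is NOT Optimum Quanto's code; the function names are hypothetical.

def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] plus one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: one scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.02, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Largest absolute difference between original and restored values.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored as one small integer plus a shared floating-point scale, which is where the memory savings come from. The rounding error per value is bounded by about half the scale.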
API endpoints
full doc /v1/registry/optimum-quanto
compatibility /v1/registry/optimum-quanto/compatibility