AutoAWQ
AutoAWQ implements the AWQ (Activation-aware Weight Quantization) algorithm for 4-bit quantization of large language models, achieving up to 2x speedup during inference. The library has been deprecated as of v0.2.9 (April 2025); vLLM has adopted the technology. Last tested with Torch 2.6.0 and Transformers 4.51.3.
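To give a rough sense of what 4-bit weight quantization stores, here is a minimal pure-Python sketch of group-wise quantization using a per-group scale and minimum offset. It is not AutoAWQ's API, and it leaves out the activation-aware scaling search that distinguishes AWQ from plain round-to-nearest quantization. The function names and the group size of 4 are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def quantize_group(weights: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Quantize one group of weights to unsigned `bits`-bit integers.

    Returns (codes, scale, w_min). Illustrative min/scale variant only;
    this is an assumption for the sketch, not AutoAWQ's actual kernel format.
    """
    qmax = (1 << bits) - 1  # 15 for 4-bit
    w_min, w_max = min(weights), max(weights)
    # Fall back to a scale of 1.0 when every weight in the group is identical.
    scale = (w_max - w_min) / qmax or 1.0
    # Shift so the minimum maps to 0, then round to the nearest integer code.
    codes = [max(0, min(qmax, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes: List[int], scale: float, w_min: float) -> List[float]:
    """Reconstruct approximate weights from integer codes."""
    return [c * scale + w_min for c in codes]


def quantize_row(row: List[float], group_size: int = 4, bits: int = 4):
    """Split a weight row into fixed-size groups and quantize each independently."""
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


if __name__ == "__main__":
    row = [-0.5, -0.1, 0.0, 0.2, 0.7, 1.0, 0.3, -0.3]
    for codes, scale, w_min in quantize_row(row):
        print(codes, dequantize_group(codes, scale, w_min))
```

Each group keeps its own scale and offset, so an outlier in one group does not degrade the resolution of the others. In this sketch the reconstruction error per weight is bounded by about half the group's scale.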