AutoRound
AutoRound is an advanced weight-only quantization algorithm for large language models (LLMs) that supports quantization down to 4 bits with minimal accuracy loss. The current version, 0.13.0, supports various Intel and AMD GPUs as well as CPUs. The package is under active development by Intel.
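
To make "4-bit weight-only quantization" concrete, here is a minimal pure-Python sketch of plain round-to-nearest quantization for a single group of weights. It is not AutoRound's algorithm or API. The function names `quantize_4bit` and `dequantize` and the sample weights are hypothetical and chosen only for illustration.

```python
def quantize_4bit(weights):
    """Asymmetric round-to-nearest quantization of one weight group to 4 bits.

    Illustrative baseline only; this is NOT how AutoRound chooses roundings.
    """
    qmax = 15  # 4 bits give integer codes 0..15
    w_min, w_max = min(weights), max(weights)
    # Fall back to 1.0 when all weights are equal, to avoid dividing by zero.
    scale = (w_max - w_min) / qmax or 1.0
    codes = [min(qmax, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, w_min):
    """Map 4-bit codes back to approximate floating-point weights."""
    return [c * scale + w_min for c in codes]


# Hypothetical weight group; activations are left untouched (weight-only).
weights = [-0.62, -0.10, 0.05, 0.27, 0.88]
codes, scale, w_min = quantize_4bit(weights)
restored = dequantize(codes, scale, w_min)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored as a 4-bit code plus a shared scale and offset per group, and the reconstruction error of any weight is bounded by half the scale. Reducing the accuracy loss this rounding causes is the problem a more advanced algorithm addresses.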