NVIDIA Model Optimizer
NVIDIA Model Optimizer (nvidia-modelopt) is an open toolkit for accelerating AI inference through model optimization techniques such as quantization, pruning, and distillation. It primarily targets PyTorch and ONNX models, integrates directly into the training loop, and supports deployment to NVIDIA inference frameworks such as TensorRT-LLM and TensorRT. The library is actively developed: the current stable version is 0.42.0, and frequent pre-release candidates (e.g., 0.43.0rcX) indicate a rapid release cadence.
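To make the quantization idea concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is a conceptual illustration only, not the nvidia-modelopt API: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and real toolkits typically calibrate scales on representative data and work on tensors rather than lists.

```python
# Conceptual sketch of symmetric per-tensor INT8 quantization.
# NOTE: these helpers are hypothetical and are NOT part of nvidia-modelopt.

def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    amax = max(abs(v) for v in values)          # largest magnitude in the tensor
    scale = amax / 127.0 if amax > 0 else 1.0   # avoid division by zero for all-zero input
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]                # illustrative values only
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the per-element error by about scale / 2.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The single shared scale is the simplest design choice; per-channel scales usually reduce error at the cost of storing more scale values.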
Resources
homepage developer.nvidia.com/tensorrt
API endpoints
full doc /v1/registry/nvidia-modelopt
compatibility /v1/registry/nvidia-modelopt/compatibility