NVIDIA TensorRT Model Optimizer Core
The NVIDIA TensorRT Model Optimizer (ModelOpt) provides a unified toolkit for model optimization and deployment across NVIDIA GPUs, supporting quantization (PTQ, QAT), pruning, distillation, and TensorRT export. As of v0.33.1, the library is actively maintained and targets Python 3.10–3.12. Release cadence is approximately monthly.
Resources
API endpoints
full doc /v1/registry/nvidia-modelopt-core