Triton Performance Analyzer
Triton Performance Analyzer (`perf_analyzer`) is a command-line interface (CLI) tool for optimizing the inference performance of models running on the NVIDIA Triton Inference Server. It measures key metrics such as throughput and latency by sending inference requests to your model and repeating measurements until the values stabilize. The tool is currently at version 2.59.1 and follows the release cadence of the broader Triton Inference Server project.
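A minimal sketch of measuring latency and throughput, assuming a server is already running and a model named `my_model` is loaded (the model name and endpoints below are placeholders):

```shell
# Report p95 latency instead of the default average:
perf_analyzer -m my_model --percentile=95

# Target a gRPC endpoint instead of the default HTTP endpoint:
perf_analyzer -m my_model -i grpc -u localhost:8001
```

By default `perf_analyzer` talks HTTP to `localhost:8000`; `-i` selects the protocol and `-u` the server URL.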
Warnings
- deprecated The related tool `genai-perf` is being deprecated. Users should migrate to `AIPerf` for continued support and enhanced features in generative AI model benchmarking.
- gotcha When installing `perf-analyzer` via `pip`, runtime dependencies (e.g., CUDA-related libraries for GPU support, `tritonclient` dependencies) are not automatically managed. Missing dependencies will cause errors during execution.
- gotcha Direct C API mode within `perf_analyzer` has known limitations, including lack of support for asynchronous mode (`-a`), shared memory mode (`--shared-memory`), and request rate range mode.
- gotcha Performance metrics, especially latency, can vary significantly between runs when not using shared memory. Using `--shared-memory=system` or `--shared-memory=cuda` can lead to more stable and representative results by reducing network overhead.
- gotcha Running multiple `perf_analyzer` processes concurrently against a single Triton Inference Server instance can produce unreliable measurements or other unexpected behavior, since the processes contend for the same server resources. Benchmark with one `perf_analyzer` process at a time.
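To reduce the run-to-run variance described above, shared memory can be enabled on the command line. A sketch, assuming a hypothetical model named `my_model`:

```shell
# System (CPU) shared memory for input/output tensors,
# avoiding transmission of tensor data over the network:
perf_analyzer -m my_model --shared-memory=system

# CUDA shared memory, keeping tensor data on the GPU:
perf_analyzer -m my_model --shared-memory=cuda
```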
Install
- pip install perf-analyzer
- docker run --rm --gpus=all -it --net=host nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk perf_analyzer -m <model>
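After installing (by either method), a quick sanity check that the binary is available:

```shell
# Print the option list; works for the pip install
# and inside the SDK container alike:
perf_analyzer --help
```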
Quickstart
# Assuming Triton Inference Server is running at localhost:8000 with a model named 'my_model'
# First, ensure Triton is running. Example (simplified):
#   docker pull nvcr.io/nvidia/tritonserver:24.02-py3
#   docker run --gpus all --rm -it --net host nvcr.io/nvidia/tritonserver:24.02-py3
#   (Inside container) tritonserver --model-repository /models &
# Run perf_analyzer from a terminal where Triton is accessible
perf_analyzer -m my_model --measurement-interval 5000 --concurrency-range 1:8:2
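The quickstart above sweeps concurrency from 1 to 8 in steps of 2. Two common variations, again assuming the hypothetical model `my_model` (the numeric ranges are placeholders to tune for your workload):

```shell
# Sweep a fixed request rate instead of concurrency
# (start:end:step, in requests per second):
perf_analyzer -m my_model --request-rate-range 100:500:100

# Save per-concurrency latency results to a CSV file for later analysis:
perf_analyzer -m my_model --concurrency-range 1:8:2 -f results.csv
```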