Polygraphy
Polygraphy is a Deep Learning Inference Prototyping and Debugging Toolkit developed by NVIDIA. It helps users run, compare, and debug inference across various deep learning frameworks and backends, especially NVIDIA TensorRT. It provides Python APIs and a command-line interface for tasks like model conversion, correctness checking, and performance profiling. Polygraphy is tightly integrated with the TensorRT ecosystem and typically receives updates in conjunction with TensorRT releases.
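Much of this is available directly from the CLI; a sketch of a few representative subcommands (file names are placeholders, and a Polygraphy install plus the relevant backends are assumed):

```shell
# Run the model under both TensorRT and ONNX Runtime and compare outputs
polygraphy run model.onnx --trt --onnxrt

# Convert the model to a serialized TensorRT engine
polygraphy convert model.onnx --convert-to trt -o model.engine

# Print a summary of the model's inputs, outputs, and nodes
polygraphy inspect model model.onnx
```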
Common errors
- `polygraphy.exception.PolygraphyException: Could not find any ONNX backend. Make sure you have onnxruntime installed.`
  cause: The ONNX Runtime backend library is not installed; the base `polygraphy` package does not pull in any backend frameworks.
  fix: Install ONNX Runtime directly (`python -m pip install onnxruntime`), or set `POLYGRAPHY_AUTOINSTALL_DEPS=1` so Polygraphy installs missing backend dependencies automatically on first use.
- `[ERROR] [TRT] ... This version of TensorRT was compiled for CUDA <X.Y> but was linked against CUDA <A.B> ...`
  cause: The installed TensorRT Python package was built against a different CUDA toolkit version than the one installed on your system or configured in your environment.
  fix: Install the TensorRT wheel that matches your CUDA toolkit and Python version, or change your CUDA toolkit to match. Refer to NVIDIA's TensorRT installation guide for the supported version combinations.
- `polygraphy.exception.PolygraphyException: TrtRunner failed to build engine: [TensorRT] ERROR: ...` (e.g., `Unsupported engine field: ...`, `Error Code 7: Internal Error ...`)
  cause: The ONNX model is not fully supported by TensorRT: unsupported operators, incorrect input/output definitions, dynamic shapes without an optimization profile, or missing plugins.
  fix: Read the TensorRT error message carefully for clues. Reproduce the build from the CLI (`polygraphy convert model.onnx --convert-to trt -o engine.plan`) to get a more detailed report. Add an optimization profile for dynamic inputs, and check whether the failing operator needs a custom TensorRT plugin or whether the model must be simplified/re-exported to be TensorRT-compatible.
- `TypeError: 'numpy.ndarray' object cannot be interpreted as an integer`
  cause: A NumPy array was passed where a plain Python integer (or tuple of integers) was expected, typically when specifying tensor shapes.
  fix: Convert shape dimensions to built-in integers (e.g. `tuple(int(d) for d in arr.shape)`). For dynamic input shapes with the TensorRT backend, define an optimization profile via `Profile().add(name, min=..., opt=..., max=...)` using tuples of ints, and pass it through `CreateConfig(profiles=[...])`.
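The shape-sanitizing fix for the `TypeError` above can be sketched in plain NumPy (variable names here are illustrative):

```python
import numpy as np

arr = np.random.rand(2, 3, 4).astype(np.float32)

# Shape math done with NumPy easily leaks ndarray/np.int64 values into
# APIs that expect built-in Python ints; coerce dimensions explicitly.
raw_dims = np.array(arr.shape)                 # ndarray of dimensions
clean_shape = tuple(int(d) for d in raw_dims)  # plain tuple of ints

assert clean_shape == (2, 3, 4)
assert all(type(d) is int for d in clean_shape)
```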
Warnings
- gotcha Polygraphy's core functionality, especially with the TensorRT backend, requires a compatible NVIDIA GPU, CUDA toolkit, and cuDNN installed. Without these, `TrtRunner` will fail or report errors.
- breaking Polygraphy is typically released alongside specific TensorRT versions. Mismatched Polygraphy and TensorRT versions can lead to unexpected behavior, `PolygraphyException` errors, or failed engine builds.
- deprecated TensorRT 10.13.2 and later (released mid-2025) dropped support for Python versions older than 3.10 for samples and demos. While Polygraphy's `requires_python` is `>=3.6`, using older Python versions (e.g., 3.6, 3.7, 3.8) with recent TensorRT backends may lead to unexpected issues or lack of features.
- gotcha TensorRT needs concrete shape ranges at engine-build time, even for models with dynamic axes. If your model has dynamic input shapes (e.g., batch size), supply an optimization profile (`Profile` with min/opt/max shapes, passed via `CreateConfig(profiles=[...])`) when building the engine that `TrtRunner` uses.
Install
- pip install polygraphy
- Backend dependencies (e.g. `onnxruntime`, `tensorrt`, `onnx`) are not installed automatically; install them separately, or set `POLYGRAPHY_AUTOINSTALL_DEPS=1` to let Polygraphy install missing ones on demand.
Imports
- Comparator
from polygraphy.comparator import Comparator
- TrtRunner
from polygraphy.backend.trt import TrtRunner
- OnnxrtRunner
from polygraphy.backend.onnxrt import OnnxrtRunner
- Profile / CreateConfig
from polygraphy.backend.trt import Profile, CreateConfig
- EngineFromNetwork / NetworkFromOnnxPath
from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath
- SessionFromOnnx
from polygraphy.backend.onnxrt import SessionFromOnnx
- G_LOGGER
from polygraphy.logger import G_LOGGER
Quickstart
import os
import numpy as np
from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, TrtRunner
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.comparator import Comparator
from polygraphy.logger import G_LOGGER

# --- Dummy ONNX Model Creation (for a runnable quickstart) ---
# In a real scenario, you would load your own ONNX model.
# This creates a simple identity model for demonstration purposes.
onnx_model_path = "identity.onnx"
if not os.path.exists(onnx_model_path):
    import onnx
    graph = onnx.helper.make_graph(
        [onnx.helper.make_node("Identity", ["input_0"], ["output_0"])],
        "identity_graph",
        [onnx.helper.make_tensor_value_info("input_0", onnx.TensorProto.FLOAT, [1, 3, 224, 224])],
        [onnx.helper.make_tensor_value_info("output_0", onnx.TensorProto.FLOAT, [1, 3, 224, 224])],
    )
    onnx.save(onnx.helper.make_model(graph, producer_name="polygraphy-quickstart"), onnx_model_path)
# --- End Dummy Model Creation ---

G_LOGGER.severity = G_LOGGER.INFO  # Set logging severity

# Define input data for the model
input_data = {"input_0": np.random.rand(1, 3, 224, 224).astype(np.float32)}

# Loaders are lazy: the engine/session are only built when a runner is activated.
build_engine = EngineFromNetwork(NetworkFromOnnxPath(onnx_model_path))
build_session = SessionFromOnnx(onnx_model_path)

runners = [
    TrtRunner(build_engine),      # TensorRT backend (requires a compatible GPU)
    OnnxrtRunner(build_session),  # ONNX Runtime backend
]

# Run the same feed dict through both runners, then compare outputs.
run_results = Comparator.run(runners, data_loader=[input_data])
assert bool(Comparator.compare_accuracy(run_results))
print("Polygraphy comparison passed for", onnx_model_path)

# Clean up the dummy model
os.remove(onnx_model_path)
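For reference, `Comparator.compare_accuracy` checks each pair of corresponding outputs against absolute/relative tolerances (tunable via `CompareFunc.simple(atol=..., rtol=...)`). Conceptually the per-output check behaves like this pure-NumPy sketch (no Polygraphy required; the arrays stand in for runner outputs):

```python
import numpy as np

# Stand-ins for outputs produced by two different runners.
trt_out = np.linspace(0.0, 1.0, num=8, dtype=np.float32)
onnxrt_out = trt_out + np.float32(1e-6)  # tiny numerical drift between backends

atol, rtol = 1e-5, 1e-5
# Elementwise criterion: |a - b| <= atol + rtol * |b|
matched = bool(np.all(np.abs(trt_out - onnxrt_out) <= atol + rtol * np.abs(onnxrt_out)))

assert matched
assert np.allclose(trt_out, onnxrt_out, rtol=rtol, atol=atol)
```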