TensorRT CUDA 13 Python Bindings
tensorrt-cu13-bindings provides Python bindings for NVIDIA's TensorRT, a high-performance deep learning inference library. It enables developers to optimize, validate, and deploy trained deep learning models on NVIDIA GPUs. The library is actively maintained with frequent minor releases, typically on a monthly to bi-monthly cadence, aligned with new TensorRT versions and CUDA compatibility updates.
Common errors
- tensorrt.tensorrt.UnsatisfiedLinkError: Could not load library: nvinfer.so
  cause: The TensorRT core libraries (e.g., `nvinfer.so`, `libnvinfer.so.X`) cannot be found or loaded by the Python bindings. This often indicates a missing or incompatible CUDA/cuDNN installation, or an incorrect `LD_LIBRARY_PATH` (on Linux).
  fix: Ensure the NVIDIA CUDA Toolkit and cuDNN are correctly installed and that their library paths (e.g., `/usr/local/cuda/lib64`) are included in your `LD_LIBRARY_PATH` environment variable. Also verify that the installed `tensorrt-cuXX-bindings` package matches your CUDA version.
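When the loader cannot find `libnvinfer`, pointing `LD_LIBRARY_PATH` at the CUDA library directory usually resolves it. A minimal sketch, assuming the default `/usr/local/cuda` install location (adjust the path for your system):

```shell
# Prepend the CUDA library directory to the loader search path
# (path is an assumption -- use the location of your actual install).
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH}"

# Confirm the directory is now first in the search path.
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | head -n 1
```

Add the `export` line to your shell profile to make the change persistent across sessions.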
- [TRT] ERROR: ../builder/Network.cpp:xxx Nvinfer error: ...
  cause: A generic error during engine building, often indicating a problem with the network definition (e.g., an unsupported layer, invalid input shapes, or no marked outputs) or resource constraints.
  fix: Review the model's operations for TensorRT compatibility and ensure all network inputs and outputs are correctly defined and marked. If memory is the issue, increase the workspace size with `config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, ...)`.
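In context, the workspace limit is set on the builder config before building. This is a configuration sketch only (it requires the `tensorrt` package and a CUDA-capable environment to run; the 1 GiB value is an arbitrary example):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Allow TensorRT up to 1 GiB of scratch workspace during engine building;
# raise this if the builder reports insufficient memory for a tactic.
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)
```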
- CUDA driver version is insufficient for CUDA runtime version
  cause: The installed NVIDIA driver is too old for the CUDA runtime that the TensorRT package expects. The `tensorrt-cuXX-bindings` package depends on a specific CUDA runtime, which in turn requires a compatible driver.
  fix: Update your NVIDIA GPU driver to a version compatible with your CUDA Toolkit and the `tensorrt-cuXX-bindings` package. See NVIDIA's CUDA Toolkit release notes for driver compatibility tables.
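To see which driver is actually installed before consulting the compatibility tables, `nvidia-smi` can report the driver version. A sketch that degrades gracefully on machines without an NVIDIA driver:

```shell
# Report the NVIDIA driver version if a driver is installed; otherwise say so.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=driver_version --format=csv,noheader
else
  echo "no NVIDIA driver found (nvidia-smi not on PATH)"
fi
```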
- [TensorRT] ERROR: Network must have at least one output
  cause: When building an `INetworkDefinition`, at least one tensor must be explicitly marked as an output via `network.mark_output(tensor)`.
  fix: Before calling `build_serialized_network()`, call `network.mark_output()` for every tensor you intend to be an output of your TensorRT engine.
Warnings
- breaking TensorRT regularly updates its required CUDA and Python versions. Version 10.13.2 dropped support for CUDA 11.x and Python < 3.10 for samples/demos, and v10.16 defaulted to CUDA 13.2. Always check the release notes for your specific package version.
- breaking As of TensorRT 10.14, sample code is no longer shipped with the PyPI packages; the samples are now hosted exclusively in the TensorRT GitHub repository.
- deprecated TensorRT has been gradually migrating plugins from `IPluginV2`-descendent versions to `IPluginV3`. Older plugin versions are deprecated and will be removed in future releases.
- gotcha The `tensorrt-cuXX-bindings` package name explicitly ties it to a major CUDA version (e.g., `cu13` for CUDA 13.x). Installing a package that doesn't match your system's CUDA installation (or driver compatibility) will lead to runtime errors.
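Because the wheel name encodes the CUDA major version, it can help to derive the expected package name from the CUDA version you target. The helper below is purely illustrative (`expected_package` is not part of TensorRT or pip):

```python
def expected_package(cuda_version: str) -> str:
    """Map a CUDA version string like '13.0' to the matching
    tensorrt-cuXX-bindings wheel name (illustrative helper only)."""
    major = cuda_version.split(".")[0]
    return f"tensorrt-cu{major}-bindings"

print(expected_package("13.0"))  # tensorrt-cu13-bindings
print(expected_package("12.4"))  # tensorrt-cu12-bindings
```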
Install
-
pip install tensorrt-cu13-bindings
Imports
- tensorrt
import tensorrt as trt
Quickstart
import tensorrt as trt

# A simple example: create a dummy network and build an engine from it
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine():
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(0)  # explicit batch is the default in TensorRT 10
    config = builder.create_builder_config()
    # Allow up to 256 MiB of scratch workspace during building
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 28)
    # Define input tensor
    input_tensor = network.add_input(name='input_data', dtype=trt.float32, shape=(1, 3, 224, 224))
    # A network input cannot be marked as an output directly,
    # so route it through an identity layer for this demonstration
    identity = network.add_identity(input_tensor)
    output_tensor = identity.get_output(0)
    # Mark output
    network.mark_output(output_tensor)
    # Build engine (requires a GPU and sufficient memory)
    print("Building TensorRT engine...")
    serialized_engine = builder.build_serialized_network(network, config)
    if serialized_engine is None:
        raise RuntimeError("Failed to build TensorRT engine.")
    print("Engine built successfully.")
    return serialized_engine

if __name__ == '__main__':
    try:
        serialized_engine = build_engine()
        runtime = trt.Runtime(TRT_LOGGER)
        engine = runtime.deserialize_cuda_engine(serialized_engine)
        print(f"Engine name: {engine.name}")
        # num_bindings was removed in TensorRT 10; use num_io_tensors instead
        print(f"Number of I/O tensors: {engine.num_io_tensors}")
    except Exception as e:
        print(f"An error occurred: {e}")
        print("Ensure you have a compatible NVIDIA GPU, CUDA Toolkit, and cuDNN installed.")