TensorRT CUDA 12 Bindings

10.16.1.11 · active · verified Thu Apr 16

TensorRT-cu12-bindings provides Python bindings for NVIDIA's TensorRT, a high-performance deep learning inference optimizer and runtime. This specific package targets CUDA 12.x environments. It is actively developed by NVIDIA, with frequent releases aligning with major TensorRT and CUDA versions, typically every few months.

Quickstart

This quickstart demonstrates how to initialize the TensorRT builder, define a simple identity network with an explicit batch dimension, configure the builder, and build a TensorRT engine. This is the fundamental process for optimizing and compiling deep learning models for NVIDIA GPUs.

import tensorrt as trt

# 1. Create a logger (TRT_LOGGER = trt.Logger(trt.Logger.INFO) for more verbose output)
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# 2. Create builder, network, and configuration
builder = trt.Builder(TRT_LOGGER)
# Networks are always explicit-batch in TensorRT 10; the EXPLICIT_BATCH flag is deprecated
network = builder.create_network(0)
config = builder.create_builder_config()

# Configure builder options
# Cap the GPU memory (in bytes) TensorRT may use for temporary buffers.
# (config.max_workspace_size was removed in TensorRT 10; use the memory-pool API.)
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 20)  # 1 MiB; raise for larger models

# 3. Define the network: a simple identity layer for demonstration
# Input shape (batch_size, channels, height, width)
input_shape = (1, 3, 224, 224)
input_tensor = network.add_input(name="input_tensor", dtype=trt.float32, shape=input_shape)

# Add an identity layer as a simple example operation
identity_layer = network.add_identity(input_tensor)
output_tensor = identity_layer.get_output(0)

# Name and mark the output tensor
output_tensor.name = "output_tensor"
network.mark_output(output_tensor)

# 4. Build the engine (builder.build_engine was removed in TensorRT 10;
# build_serialized_network returns the serialized engine directly)
print(f"Building TensorRT engine with input shape {input_shape}...")
serialized_engine = builder.build_serialized_network(network, config)

if serialized_engine:
    print("TensorRT engine built successfully!")
    # Example: write the serialized engine to disk
    # with open("my_identity_engine.trt", "wb") as f:
    #     f.write(serialized_engine)
    # To run inference, deserialize it:
    # runtime = trt.Runtime(TRT_LOGGER)
    # engine = runtime.deserialize_cuda_engine(serialized_engine)
else:
    print("Failed to build TensorRT engine.")

# Cleanup
del network, builder, config, serialized_engine
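The workspace cap above uses a bit shift to express a byte count, a common idiom in TensorRT examples. A quick standalone sketch of how those shifts map to sizes (no TensorRT required):

```python
# Byte sizes expressed as bit shifts, as in set_memory_pool_limit above.
# 1 << 20 shifts 1 left by 20 bits, i.e. 2**20 bytes.
MiB = 1 << 20   # 1,048,576 bytes (the 1 MiB used in the quickstart)
GiB = 1 << 30   # 1,073,741,824 bytes

# A 2 GiB workspace, a typical setting for larger models:
workspace_bytes = 2 * GiB
print(workspace_bytes)  # 2147483648
```

Pass a value like `workspace_bytes` as the second argument to `config.set_memory_pool_limit` when the 1 MiB demo limit is too small for your network.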
