TensorRT CUDA 13 Python Bindings

10.16.1.11 · active · verified Fri Apr 17

tensorrt-cu13-bindings provides Python bindings for NVIDIA's TensorRT, a high-performance deep learning inference library. It enables developers to optimize, validate, and deploy trained deep learning models on NVIDIA GPUs. The library is actively maintained, with minor releases roughly every one to two months that track new TensorRT versions and CUDA compatibility updates.

Quickstart

This quickstart initializes the TensorRT logger, creates a builder and network, defines a simple input and output, builds a serialized engine, and deserializes it with a runtime. This process is the basis for converting a deep learning model into a TensorRT-optimized engine. Running inference additionally requires creating an `IExecutionContext` and managing device memory.

import tensorrt as trt

# A simple example: build a minimal network and engine
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine():
    builder = trt.Builder(TRT_LOGGER)
    # TensorRT 10 networks are always explicit-batch; the EXPLICIT_BATCH
    # creation flag is deprecated, so no flag is passed here
    network = builder.create_network(0)
    config = builder.create_builder_config()

    # Define input tensor
    input_tensor = network.add_input(name='input_data', dtype=trt.float32, shape=(1, 3, 224, 224))

    # Define a simple operation: an identity layer, so the output is a
    # distinct tensor produced by a layer rather than the input itself
    identity_layer = network.add_identity(input_tensor)
    output_tensor = identity_layer.get_output(0)
    output_tensor.name = 'output_data'

    # Mark output
    network.mark_output(output_tensor)

    # Build engine (requires GPU and sufficient memory)
    print("Building TensorRT engine...")
    serialized_engine = builder.build_serialized_network(network, config)
    if serialized_engine is None:
        raise RuntimeError("Failed to build TensorRT engine.")

    print("Engine built successfully.")
    return serialized_engine

if __name__ == '__main__':
    try:
        serialized_engine = build_engine()
        runtime = trt.Runtime(TRT_LOGGER)
        engine = runtime.deserialize_cuda_engine(serialized_engine)
        if engine is None:
            raise RuntimeError("Failed to deserialize TensorRT engine.")

        print(f"Engine name: {engine.name}")
        # num_bindings was removed in TensorRT 10; use num_io_tensors instead
        print(f"Number of I/O tensors: {engine.num_io_tensors}")
    except Exception as e:
        print(f"An error occurred: {e}")
        print("Ensure you have a compatible NVIDIA GPU and driver installed.")
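
Before allocating device memory for inference, it usually helps to list what the engine expects. The following is a minimal sketch, not part of the original quickstart: `describe_io_tensors` is a hypothetical helper name, and it assumes a deserialized engine that exposes the name-based tensor API of recent TensorRT releases (`num_io_tensors`, `get_tensor_name`, `get_tensor_mode`, `get_tensor_shape`, `get_tensor_dtype`).

```python
def describe_io_tensors(engine):
    """Collect name, mode, shape, and dtype for each I/O tensor of an engine.

    `engine` is assumed to be a deserialized TensorRT engine (ICudaEngine).
    This helper is illustrative and not part of the TensorRT API.
    """
    described = []
    for index in range(engine.num_io_tensors):
        name = engine.get_tensor_name(index)
        described.append({
            "name": name,
            # On a real engine this is trt.TensorIOMode.INPUT or OUTPUT
            "mode": engine.get_tensor_mode(name),
            # Dims are converted to a plain tuple for easy printing
            "shape": tuple(engine.get_tensor_shape(name)),
            "dtype": engine.get_tensor_dtype(name),
        })
    return described
```

Called on the engine from the quickstart, for example `for tensor in describe_io_tensors(engine): print(tensor)`, this should report one input and one output tensor. The shapes and dtypes it returns are what a caller would use to size host and device buffers.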
