NVIDIA TensorRT for CUDA 12

10.16.1.11 · active · verified Thu Apr 16

TensorRT is NVIDIA's high-performance deep learning inference optimizer and runtime. The `tensorrt-cu12` package provides the Python bindings built against CUDA Toolkit 12.x. As of version `10.16.1.11`, it optimizes trained deep learning models and deploys them for low-latency, high-throughput inference on NVIDIA GPUs. Releases are frequent, typically tracking updates to the TensorRT core libraries and the CUDA toolkit.

Common errors

Warnings

Install
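The package name below comes from this page; assuming a standard `pip` setup on a machine with an NVIDIA driver, the CUDA 12 build installs with:

```shell
# Install the CUDA 12.x build of the TensorRT Python bindings.
# Pin the version shown on this page for reproducible environments.
pip install tensorrt-cu12==10.16.1.11
```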

Imports
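The conventional alias used throughout NVIDIA's samples (requires an installed `tensorrt-cu12` and a supported NVIDIA GPU driver):

```python
import tensorrt as trt  # top-level module exposing Logger, Builder, OnnxParser, etc.
```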

Quickstart

This quickstart demonstrates the foundational steps of initializing TensorRT components: a logger, a builder, and a network. It serves as a basic sanity check of the TensorRT installation and API access. A complete workflow would additionally parse an existing model (typically ONNX; TensorRT 10 no longer ships the UFF or Caffe parsers), configure the builder, and build an inference engine.

import tensorrt as trt

# A basic example: creating a TensorRT builder and network.
# Note: a real application would parse an ONNX model and build an engine.

# Create a logger to track verbose output and errors
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

try:
    # Create a builder
    # Create a builder
    builder = trt.Builder(TRT_LOGGER)
    print(f"TensorRT Builder created successfully (TensorRT {trt.__version__}).")

    # Create an empty network definition. In TensorRT 10 all networks are
    # explicit-batch; the EXPLICIT_BATCH flag is deprecated, so no creation
    # flags are needed. (builder.max_batch_size was removed in TensorRT 10.)
    network = builder.create_network(0)
    print("Explicit-batch network created.")

    # Example: Add an input layer (simplified, a real model would have specific shapes)
    input_tensor = network.add_input(name='input_tensor', dtype=trt.float32, shape=(1, 3, 224, 224))
    print(f"Added input tensor with shape {input_tensor.shape}")

    # In a real scenario, you'd parse a model with trt.OnnxParser(network, TRT_LOGGER),
    # then create a builder config and call builder.build_serialized_network().

except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure you have a compatible NVIDIA GPU and a matching CUDA driver/toolkit installation.")

# Clean up resources (important for complex applications)
# Note: In a production script, `del` might not be strictly necessary if objects go out of scope,
# but it's good practice for clarity or long-running processes.
# Also, ensure network and builder are valid objects before attempting to delete.
if 'network' in locals() and network is not None: del network
if 'builder' in locals() and builder is not None: del builder
if 'TRT_LOGGER' in locals() and TRT_LOGGER is not None: del TRT_LOGGER
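The comments in the quickstart mention the remaining steps of a full workflow: parsing an ONNX model, configuring the builder, and serializing an engine. A minimal sketch of those steps follows, assuming a hypothetical ONNX file at `model.onnx` (the path and the 1 GiB workspace limit are illustrative, and building requires a supported NVIDIA GPU):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(0)  # explicit batch is the default in TensorRT 10
parser = trt.OnnxParser(network, TRT_LOGGER)

# "model.onnx" is a placeholder path for illustration.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

# Configure the builder: cap the workspace memory pool at 1 GiB.
config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)

# build_serialized_network returns a serialized engine (an IHostMemory blob)
# that can be written to disk and later deserialized by trt.Runtime.
serialized_engine = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(serialized_engine)
```

The serialized `.engine` file is specific to the GPU architecture and TensorRT version it was built with, so engines are typically rebuilt per deployment target rather than shipped as artifacts.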
