Transformer Engine (CUDA 12)
Transformer Engine (TE) is a Python library from NVIDIA for accelerating Transformer models on NVIDIA GPUs. It enables lower-precision training and inference, notably 8-bit (FP8) and 4-bit (NVFP4) floating point on Hopper, Ada, and Blackwell GPUs, which improves performance and reduces memory use. It provides highly optimized building blocks for popular Transformer architectures and an automatic-mixed-precision-like API for PyTorch and JAX. The current version is 2.13.0; releases are frequent and often align with new NVIDIA hardware and software.
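To make the mixed-precision-style API concrete, here is a minimal PyTorch sketch modeled on the pattern in the upstream README: one `te.Linear` layer run under `te.fp8_autocast` with a `DelayedScaling` recipe. The helper name `fp8_linear_forward`, the tensor sizes, and the compute-capability check (8.9 or higher, i.e. Ada and newer) are assumptions made for illustration. Newer releases may prefer a differently named autocast entry point, so treat this as a sketch rather than the canonical usage.

```python
def fp8_linear_forward(batch=16, in_features=768, out_features=768):
    """Run one te.Linear forward pass under FP8 autocast.

    Returns the output shape as a tuple, or None when PyTorch,
    Transformer Engine, or an FP8-capable GPU is not available.
    (Hypothetical helper, not part of the library.)
    """
    try:
        import torch
        import transformer_engine.pytorch as te
        from transformer_engine.common import recipe
    except (ImportError, OSError):
        return None  # library or its native dependencies are missing

    # Assumption: FP8 needs compute capability 8.9+ (Ada, Hopper, Blackwell).
    if not torch.cuda.is_available():
        return None
    if torch.cuda.get_device_capability() < (8, 9):
        return None

    # TE modules allocate their parameters on the GPU by default.
    layer = te.Linear(in_features, out_features, bias=True)
    x = torch.randn(batch, in_features, device="cuda")

    # Delayed scaling derives FP8 scale factors from recent amax history.
    fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

    # Inside the context, supported TE modules run their GEMMs in FP8.
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        y = layer(x)

    return tuple(y.shape)


if __name__ == "__main__":
    print(fp8_linear_forward())
```

The dimensions are kept at multiples of 16 because FP8 kernels place alignment constraints on tensor shapes. On a machine without a suitable GPU the helper returns `None` instead of raising.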
Resources
API endpoints
full doc /v1/registry/transformer-engine-cu12
compatibility /v1/registry/transformer-engine-cu12/compatibility