Transformer Engine (CUDA 12)
Transformer Engine (TE) is a Python library from NVIDIA for accelerating Transformer models on NVIDIA GPUs. It enables lower-precision training and inference, notably 8-bit (FP8) and 4-bit (NVFP4) floating-point formats on Hopper, Ada, and Blackwell GPUs, improving performance and reducing memory utilization. It provides highly optimized building blocks for popular Transformer architectures and an automatic-mixed-precision-style API for PyTorch and JAX. The current version is 2.13.0, with an active release cadence that often aligns with new NVIDIA hardware and software.
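The key idea behind FP8 training is per-tensor scaling: FP8 formats have a narrow dynamic range (E4M3 tops out at 448), so each tensor is multiplied by a scale factor derived from its recent absolute-maximum (amax) history before being cast to FP8. The following is a plain-Python sketch of this delayed-scaling idea; it is illustrative only, not Transformer Engine's actual implementation, and it models only the scale/clamp step (real FP8 casts also round the mantissa).

```python
# Illustrative sketch of FP8 "delayed scaling" -- NOT Transformer Engine's
# internal implementation. E4M3's largest representable magnitude is 448.
FP8_E4M3_MAX = 448.0

def compute_scale(amax_history, margin=0):
    """Pick a scale so the largest recent value maps near the FP8 maximum."""
    amax = max(amax_history)  # amax over the sliding history window
    if amax == 0.0:
        return 1.0
    return FP8_E4M3_MAX / amax / (2 ** margin)

def fake_quantize(values, scale):
    """Scale into FP8 range, clamp, then unscale (simulated round trip)."""
    out = []
    for v in values:
        scaled = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v * scale))
        out.append(scaled / scale)
    return out

amax_history = [3.0, 7.5, 5.0]       # amaxes observed in previous steps
scale = compute_scale(amax_history)  # 448 / 7.5
# Values within the recent amax survive; outliers beyond it are clamped.
print(fake_quantize([1.0, 7.5, 1000.0], scale))
```

Values inside the recent amax window round-trip essentially unchanged, while an outlier like 1000.0 is clamped back to the window's maximum; this is why TE tracks amax over a history rather than a single step.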
Common errors
- `ModuleNotFoundError: No module named 'transformer_engine.pytorch'`
  Cause: The framework-specific bindings for PyTorch (or JAX) were not installed. Installing `transformer-engine-cu12` by itself provides only the core library, not the Python bindings for deep learning frameworks.
  Fix: Install with the appropriate extra dependency: `pip install --no-build-isolation transformer-engine-cu12[pytorch]` (for PyTorch) or `pip install --no-build-isolation transformer-engine-cu12[jax]` (for JAX).
- `ImportError: undefined symbol: _ZN3c106cuda9SetDeviceEi`
  Cause: This error typically indicates a C++ ABI incompatibility between PyTorch and Transformer Engine: they were compiled with different C++ standards or settings.
  Fix: Verify that both PyTorch and Transformer Engine are built with compatible C++ ABIs. The simplest solution is often to use the NVIDIA NGC PyTorch or JAX Docker containers, which come pre-configured with compatible versions. If installing from source, ensure consistent compiler flags.
- `fatal error: cudnn.h: No such file or directory`
  Cause: The cuDNN headers are not found by the build system during installation. This can happen if cuDNN is not installed or its path is not exposed to the build environment.
  Fix: Install cuDNN 9.3+ and ensure that environment variables such as `CUDNN_PATH`, `CUDNN_HOME`, and `LD_LIBRARY_PATH` point to your cuDNN installation. For example: `export CUDNN_PATH=/path/to/cudnn`, `export CUDNN_HOME=$CUDNN_PATH`, `export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$LD_LIBRARY_PATH`.
- `ERROR: Failed building wheel for transformer-engine`
  Cause: This generic `pip install` error often masks underlying issues with CMake, the CUDA Toolkit, the `nvcc` path, or the memory demands of compiling FlashAttention.
  Fix: First, ensure the CUDA Toolkit (12.1+), NVIDIA drivers, and cuDNN (9.3+) are correctly installed and configured. Check that `nvcc` is in your `PATH` or that the `CUDA_PATH` environment variable is set (e.g., `export CUDA_PATH=/usr/local/cuda`). If the error persists, especially when building FlashAttention, set `export MAX_JOBS=1` before installation to reduce memory usage during compilation: `MAX_JOBS=1 pip install --no-build-isolation transformer-engine-cu12[pytorch]`.
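Several of these failures come down to missing build prerequisites. A small preflight script can surface them before running `pip install`; note that this helper is purely illustrative (the function name and checks are not part of Transformer Engine):

```python
import os
import shutil

def build_preflight():
    """Report common prerequisites for building Transformer Engine.

    Hypothetical helper for illustration -- not part of Transformer Engine.
    """
    return {
        # nvcc must be discoverable for the CUDA compilation steps
        "nvcc_on_path": shutil.which("nvcc") is not None,
        # CUDA_PATH / CUDA_HOME give the build a fallback toolkit location
        "cuda_path_set": bool(os.environ.get("CUDA_PATH") or os.environ.get("CUDA_HOME")),
        # cuDNN headers (cudnn.h) are located via these variables
        "cudnn_path_set": bool(os.environ.get("CUDNN_PATH") or os.environ.get("CUDNN_HOME")),
        # MAX_JOBS=1 limits memory pressure from parallel compilation
        "max_jobs": os.environ.get("MAX_JOBS", "<unset>"),
    }

for key, value in build_preflight().items():
    print(f"{key}: {value}")
```

Run it before installing; any `False` entry points at one of the fixes listed above.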
Warnings
- breaking Breaking changes in `InferenceParams` and removal of the `interval` argument for `DelayedScaling` in PyTorch. `num_heads_kv`, `head_dim_k`, and `dtype` are now required for `InferenceParams` initialization, and `pre_step` must be called.
- breaking The deprecated packed fused attention C APIs (`nvte_fused_attn_{fwd,bwd}_{qkvpacked,kvpacked}`) have been removed. Users must migrate to the non-packed API variants.
- deprecated The installation of Transformer Engine now requires the `--no-build-isolation` flag when using PyPI or building from source. Support for installations *with* build isolation will be removed in a future release.
- gotcha ABI compatibility issues can arise if PyTorch and Transformer Engine are built with different C++ ABI settings, especially outside of NGC containers. This leads to `ImportError` with undefined symbols.
- gotcha Installing `transformer-engine-cu12` from PyPI may crash in environments with CUDA versions older than 12.8, even though the `cu12` suffix suggests support for any CUDA 12.x.
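Given the removal of the `interval` argument, FP8 recipes should be constructed from the remaining `DelayedScaling` knobs only. A sketch of building a recipe (the values shown are illustrative, not recommendations):

```python
from transformer_engine.common.recipe import DelayedScaling, Format

# DelayedScaling no longer accepts `interval`; configure the remaining
# knobs only. Values here are illustrative.
fp8_recipe = DelayedScaling(
    fp8_format=Format.HYBRID,   # E4M3 in the forward pass, E5M2 in backward
    amax_history_len=16,        # length of the amax sliding-history window
    amax_compute_algo="max",    # reduce the history window with max()
)

# The recipe is then passed to fp8_autocast, e.g.:
#   with fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
#       out = te_layer(inp)
```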
Install
- pip install --no-build-isolation transformer-engine-cu12[pytorch]
- pip install --no-build-isolation transformer-engine-cu12[jax]
- pip install --no-build-isolation transformer-engine-cu12[core]
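In shells where square brackets are glob characters (zsh in particular), quote the extras specifier so the command is passed to pip unchanged:

```shell
# Quoting prevents zsh from treating [pytorch] as a glob pattern
pip install --no-build-isolation "transformer-engine-cu12[pytorch]"
```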
Imports
- Linear
from transformer_engine.pytorch import Linear
- LayerNorm
from transformer_engine.pytorch import LayerNorm
- TransformerLayer
from transformer_engine.pytorch import TransformerLayer
- fp8_autocast
from transformer_engine.pytorch import fp8_autocast
from transformer_engine.pytorch.fp8 import fp8_autocast
Quickstart
import torch
from transformer_engine.pytorch import Linear, fp8_autocast
# Dummy input tensor (FP8 GEMMs require tensor dimensions divisible by 16)
input_tensor = torch.randn(16, 128, device='cuda', dtype=torch.float16)
# Initialize a Transformer Engine Linear layer
te_linear_layer = Linear(128, 256, bias=True, params_dtype=torch.float16).cuda()
# Perform a forward pass with FP8 autocasting
with fp8_autocast(enabled=True):
    output_tensor = te_linear_layer(input_tensor)
print(f"Input shape: {input_tensor.shape}, dtype: {input_tensor.dtype}")
print(f"Output shape: {output_tensor.shape}, dtype: {output_tensor.dtype}")
# FP8 is used internally for the matrix multiply; module inputs and outputs
# stay in higher precision, so the output dtype matches the input dtype.
assert output_tensor.dtype == torch.float16, "Output should match the input dtype; FP8 is internal."
print("Quickstart example ran successfully with FP8 autocasting.")