SageAttention

2.0.1 · active · verified Fri Apr 17

SageAttention is a Python library providing accurate and efficient 8-bit plug-and-play attention mechanisms, including Mixture-of-Experts (MoE) components. It aims to accelerate large language models with minimal accuracy loss. The latest version is 2.0.1, though the PyPI package may lag behind GitHub releases; releases typically accompany major architectural changes or significant new features.
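To illustrate the core idea of 8-bit attention, here is a minimal, self-contained sketch in plain PyTorch. It is not SageAttention's actual kernel: it quantizes Q and K to INT8 with per-tensor symmetric scales, computes the QK^T scores from the quantized values (simulated in float, since CPU PyTorch has no INT8 matmul kernel), dequantizes before the softmax, and compares the result against full-precision attention. All function names here are illustrative.

```python
import torch
import torch.nn.functional as F

def quantize_int8(t):
    # Per-tensor symmetric INT8 quantization: scale so max |value| maps to 127.
    scale = t.abs().amax() / 127.0
    q = torch.clamp(torch.round(t / scale), -127, 127).to(torch.int8)
    return q, scale

def int8_attention(q, k, v):
    # Quantize Q and K to INT8; the QK^T product would run in integer
    # precision on hardware (simulated here by casting back to float),
    # then is dequantized with the two scales before the softmax.
    q_int8, q_scale = quantize_int8(q)
    k_int8, k_scale = quantize_int8(k)
    scores = torch.matmul(q_int8.float(), k_int8.float().transpose(-2, -1))
    scores = scores * (q_scale * k_scale) / q.shape[-1] ** 0.5
    attn = torch.softmax(scores, dim=-1)
    return torch.matmul(attn, v)  # the P·V product is kept in full precision here

torch.manual_seed(0)
q = torch.randn(1, 8, 10, 64)  # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 10, 64)
v = torch.randn(1, 8, 10, 64)

ref = F.scaled_dot_product_attention(q, k, v)
approx = int8_attention(q, k, v)
print("max abs error vs FP32 attention:", (ref - approx).abs().max().item())
```

The quantization error stays small because softmax is insensitive to small perturbations of the scores, which is why 8-bit attention can preserve model quality.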

Quickstart

This quickstart shows how to instantiate and use the core `SageMoE` (Mixture-of-Experts) layer and a `TransformerBlock` that uses SageAttention internally. It creates dummy input tensors and prints the resulting output shapes.

import torch
from sageattention.sagemoe.moe_layer import SageMoE
from sageattention.sagemoe.transformer_block import TransformerBlock

# Example for SageMoE
# Initialize a Mixture-of-Experts layer
moe_model = SageMoE(dim=512, num_experts=8, top_k=2)
# Create a dummy input tensor
x_moe = torch.randn(1, 10, 512) # (batch_size, sequence_length, embedding_dimension)
# Pass input through the MoE layer
output_moe = moe_model(x_moe)
print(f"SageMoE Output Shape: {output_moe.shape}")

# Example for TransformerBlock
# Initialize a Transformer block with attention and MoE
transformer_block = TransformerBlock(dim=512, heads=8, dim_head=64, ff_mult=4, num_experts=8, top_k=2)
# Create a dummy input tensor
x_transformer = torch.randn(1, 10, 512)
# Pass input through the Transformer block
output_transformer = transformer_block(x_transformer)
print(f"TransformerBlock Output Shape: {output_transformer.shape}")
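The quickstart treats `SageMoE` as a black box. The following is a hypothetical sketch of the top-k routing pattern that such a layer typically implements: a learned gate scores every expert per token, the `top_k` highest scores are renormalized with a softmax, and each token's output is the weighted sum of its selected experts' outputs. The class and its internals are illustrative, not SageMoE's actual code; only the `dim`/`num_experts`/`top_k` interface mirrors the quickstart.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    # Hypothetical sketch of top-k expert routing; not SageMoE's actual code.
    def __init__(self, dim, num_experts, top_k):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts)  # router: one score per expert
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )

    def forward(self, x):
        # x: (batch, seq_len, dim)
        logits = self.gate(x)                           # (B, S, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # pick top_k experts per token
        weights = torch.softmax(weights, dim=-1)        # renormalize the selected scores
        out = torch.zeros_like(x)
        # Dense loop over experts for clarity; real implementations dispatch sparsely.
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                           # (B, S, top_k) selection mask
            if mask.any():
                w = (weights * mask).sum(dim=-1, keepdim=True)  # per-token weight for expert e
                out = out + w * expert(x)
        return out

torch.manual_seed(0)
moe = TopKMoE(dim=512, num_experts=8, top_k=2)
y = moe(torch.randn(1, 10, 512))
print(y.shape)  # torch.Size([1, 10, 512])
```

With `top_k=2` of 8 experts, only a quarter of the expert parameters are active per token, which is the source of MoE's compute savings at a given parameter count.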
