SageAttention
SageAttention is a Python library that provides accurate, efficient 8-bit plug-and-play attention mechanisms, including Mixture-of-Experts (MoE) implementations. It aims to accelerate large language models with minimal loss in model quality. The current bleeding-edge version is 2.0.1, though the PyPI package may lag behind GitHub releases. New releases typically coincide with major architectural changes or significant new features.
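To make the idea of "8-bit attention" concrete, here is a minimal pure-Python sketch. It is not SageAttention's actual kernel or API. The function names (`quantize_int8`, `attention`) and the choice of symmetric per-tensor quantization are assumptions made for illustration only. The sketch quantizes Q and K to INT8, computes the score matrix with integer dot products, rescales the scores back to floating point, and then applies the usual softmax and weighted sum over V.

```python
import math


def quantize_int8(matrix):
    """Symmetric per-tensor INT8 quantization (illustrative assumption).

    Returns the integer matrix and the scale such that value ~= int * scale.
    """
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    ints = [[max(-127, min(127, round(x / scale))) for x in row] for row in matrix]
    return ints, scale


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v, quantized=False):
    """Single-head attention over lists of rows: softmax(Q K^T / sqrt(d)) V.

    With quantized=True, Q K^T is computed on INT8 values and dequantized
    by multiplying with the product of the two scales.
    """
    d = len(q[0])
    if quantized:
        q_int, q_scale = quantize_int8(q)
        k_int, k_scale = quantize_int8(k)
        scores = [
            [sum(a * b for a, b in zip(q_row, k_row)) * q_scale * k_scale
             for k_row in k_int]
            for q_row in q_int
        ]
    else:
        scores = [
            [sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
            for q_row in q
        ]
    probs = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return [
        [sum(p * v_row[j] for p, v_row in zip(p_row, v)) for j in range(len(v[0]))]
        for p_row in probs
    ]


# Hypothetical toy inputs: 2 queries, 3 keys/values, head dimension 4.
q = [[0.1, -0.4, 0.8, 0.3], [0.9, 0.2, -0.5, 0.0]]
k = [[0.5, 0.5, -0.2, 0.1], [-0.7, 0.3, 0.6, -0.9], [0.0, 1.0, -1.0, 0.4]]
v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

exact = attention(q, k, v)
approx = attention(q, k, v, quantized=True)
max_error = max(abs(a - b) for ra, rb in zip(exact, approx) for a, b in zip(ra, rb))
print("max absolute difference:", max_error)
```

On these toy inputs the quantized result stays close to the full-precision one, which is the property the description refers to as a minimal drop in model quality. A production kernel would run on a GPU and would typically use finer-grained scales than one per tensor.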
API endpoints
full doc /v1/registry/sageattention
compatibility /v1/registry/sageattention/compatibility
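The two paths above can be turned into full URLs as in the sketch below. The host `https://registry.example.com` is a placeholder, since this page does not state the registry's base URL. The names `BASE_URL`, `ENDPOINTS`, `endpoint_url`, and `fetch_text` are hypothetical helpers, and the response format is not assumed beyond being UTF-8 text.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Placeholder host: the real registry base URL is not given on this page.
BASE_URL = "https://registry.example.com"

# Paths copied from the endpoint list above.
ENDPOINTS = {
    "full_doc": "/v1/registry/sageattention",
    "compatibility": "/v1/registry/sageattention/compatibility",
}


def endpoint_url(name, base_url=BASE_URL):
    """Build the absolute URL for one of the listed endpoints."""
    return urljoin(base_url, ENDPOINTS[name])


def fetch_text(name, base_url=BASE_URL, timeout=10):
    """Fetch an endpoint and return the body as text (assumes UTF-8)."""
    with urlopen(endpoint_url(name, base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `endpoint_url("compatibility")` only builds the URL. `fetch_text` performs a network request and will fail until `BASE_URL` points at the real registry host.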