Flash Linear Attention

library 0.4.2 · python
verified May 24, 2026

Flash Linear Attention (FLA) is a Python library providing efficient, Triton-based implementations of state-of-the-art linear attention models and emerging sequence modeling architectures. It targets high-performance training and inference on NVIDIA, AMD, and Intel GPUs. As of version 0.4.2, the library is actively maintained with frequent releases and offers optimized kernels, fused modules, and integration-ready layers for PyTorch and Hugging Face models.
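
For readers new to the term, the sketch below illustrates what "linear attention" generally means, using only the Python standard library. It is not FLA's API and makes no claim about how the library's Triton kernels are written; the function names `linear_attention_recurrent` and `linear_attention_quadratic` are hypothetical, chosen for this illustration. It assumes the simplest variant: causal, un-normalized, with no feature map or gating. The point is that the same output can be computed either by scoring every earlier position (quadratic in sequence length) or by carrying a fixed-size running state (linear in sequence length).

```python
# Illustrative only: plain-Python causal linear attention, NOT the FLA API.
# Inputs are lists of per-token vectors: q and k have width d_k, v has width d_v.

def linear_attention_recurrent(q, k, v):
    """Linear-time form: keep a d_k x d_v state S, updated as S += outer(k_t, v_t)."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q_t, k_t, v_t in zip(q, k, v):
        # Fold the current token into the running state.
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] += k_t[i] * v_t[j]
        # Read out: o_t = q_t @ S.
        outputs.append(
            [sum(q_t[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs


def linear_attention_quadratic(q, k, v):
    """Reference form: o_t = sum over s <= t of (q_t . k_s) * v_s."""
    d_v = len(v[0])
    outputs = []
    for t, q_t in enumerate(q):
        out = [0.0] * d_v
        for s in range(t + 1):  # causal mask: only positions up to t
            score = sum(a * b for a, b in zip(q_t, k[s]))
            for j in range(d_v):
                out[j] += score * v[s][j]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    k = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(linear_attention_recurrent(q, k, v))
    print(linear_attention_quadratic(q, k, v))
```

Both functions return the same values for the same inputs. The recurrent form needs only the d_k x d_v state regardless of sequence length, which is the property that makes this family of models attractive for long sequences.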