Ring Flash Attention
A ring attention implementation built on flash attention for efficient long-context LLM training. It splits the sequence across devices, so both attention memory and compute are distributed. Current version: 0.1.8. Actively maintained on GitHub, with weekly releases.
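To make the idea concrete, here is a minimal single-process sketch of the ring attention pattern in NumPy. It is not this package's API: the function names `full_attention` and `ring_attention_sim` and the `world_size` parameter are hypothetical, the "devices" are simulated with a loop, attention is non-causal, and no flash attention kernels or real communication are involved. It only illustrates how each rank can keep its query shard fixed, receive key/value shards one at a time around a ring, and merge the partial results with a running softmax.

```python
import numpy as np


def full_attention(q, k, v):
    """Reference: softmax(q k^T / sqrt(d)) v computed in one shot."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v


def ring_attention_sim(q, k, v, world_size):
    """Simulate non-causal ring attention over `world_size` ranks (hypothetical helper)."""
    d = q.shape[-1]
    # Each simulated rank owns one shard of the sequence.
    q_shards = np.array_split(q, world_size)
    k_shards = np.array_split(k, world_size)
    v_shards = np.array_split(v, world_size)

    # Per-rank running state for the online softmax.
    out = [np.zeros((qs.shape[0], v.shape[-1])) for qs in q_shards]
    row_max = [np.full((qs.shape[0], 1), -np.inf) for qs in q_shards]
    row_sum = [np.zeros((qs.shape[0], 1)) for qs in q_shards]

    for step in range(world_size):
        for rank in range(world_size):
            # At this step, the rank holds the K/V shard that started
            # `step` positions behind it on the ring.
            src = (rank - step) % world_size
            scores = q_shards[rank] @ k_shards[src].T / np.sqrt(d)

            # Rescale the previous partial results to the new row maximum.
            new_max = np.maximum(row_max[rank], scores.max(axis=-1, keepdims=True))
            scale = np.exp(row_max[rank] - new_max)
            probs = np.exp(scores - new_max)

            out[rank] = out[rank] * scale + probs @ v_shards[src]
            row_sum[rank] = row_sum[rank] * scale + probs.sum(axis=-1, keepdims=True)
            row_max[rank] = new_max

    # Normalize once all K/V shards have visited every rank.
    return np.concatenate([o / s for o, s in zip(out, row_sum)])
```

In this sketch no rank ever materializes the full attention matrix: each step only touches one query shard and one key/value shard, which is where the memory saving for long sequences comes from. The result should match `full_attention` up to floating-point error.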