Pre-compiled cubins for FlashInfer

library 0.6.7.post3 · python
verified May 22, 2026

FlashInfer-cubin provides pre-compiled kernel binaries (cubins) for FlashInfer across a wide range of GPU architectures. It is an optional companion package for `flashinfer-python`: with the binaries installed, kernels do not need to be JIT-compiled or downloaded at runtime, which shortens initialization and allows offline use. The FlashInfer project delivers high-performance GPU kernels for LLM serving and inference and is actively developed, with frequent nightly builds and regular patch releases.
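Because the cubin package is optional, it can be useful to confirm it is present before relying on offline startup. The following is a minimal sketch using only the Python standard library. It assumes the installed distribution names are `flashinfer-python` and `flashinfer-cubin`, as the names in the description suggest; the helper functions `installed_version` and `cubin_status` are hypothetical and not part of FlashInfer.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional

# Assumed distribution names, taken from the package names in the description.
DISTRIBUTIONS = ("flashinfer-python", "flashinfer-cubin")


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def cubin_status() -> Dict[str, Optional[str]]:
    """Map each assumed distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in DISTRIBUTIONS}


if __name__ == "__main__":
    for name, ver in cubin_status().items():
        print(f"{name}: {ver or 'not installed'}")
```

This only reports what is installed in the current environment. It does not check whether FlashInfer actually loads the pre-compiled binaries at runtime.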
