Pre-compiled cubins for FlashInfer
FlashInfer-cubin provides pre-compiled kernel binaries for FlashInfer, supporting a wide range of GPU architectures. This optional package for `flashinfer-python` eliminates JIT compilation and downloading overhead at runtime, leading to faster initialization and enabling offline usage. The FlashInfer project focuses on delivering high-performance LLM GPU kernels for serving and inference, maintaining an active development cycle with frequent nightly builds and regular patch releases.
Warnings
- breaking FlashInfer, and by extension `flashinfer-cubin`, has strict compatibility requirements for CUDA and PyTorch versions. Incompatible versions can lead to runtime failures due to mismatches in precompiled kernels (e.g., CUDA 12 vs 13 toolkits) or Python library dependencies.
- gotcha `flashinfer-cubin` might not always contain all necessary pre-compiled cubins for every kernel or newer GPU architectures, especially for specific components like TRTLLM FMHA kernels. In such cases, `flashinfer-python` may attempt to download missing cubins at runtime, which can fail in isolated network environments or lead to unexpected JIT compilation.
- gotcha The `FLASHINFER_CUBIN_DIR` environment variable, intended to specify a custom path for cubin files, may be ignored when `flashinfer-cubin` is installed via pip. This can lead to issues in containerized or non-root environments where explicit control over artifact paths is required.
- gotcha While FlashInfer supports a wide range of NVIDIA GPU architectures (SM 7.5 'Turing' and later, up to SM 12.1 'Blackwell'), not all advanced features (e.g., FP8/FP4 operations, certain attention types) are supported across all compute capabilities. Performance can also vary significantly.
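The `FLASHINFER_CUBIN_DIR` override mentioned above is set in the environment before the process starts; a minimal sketch (the directory path is a hypothetical example, and, per the warning, the variable may be ignored when `flashinfer-cubin` is installed via pip):

```shell
# Hypothetical path: a directory pre-populated with FlashInfer cubin artifacts,
# useful for air-gapped hosts where runtime downloads would fail.
export FLASHINFER_CUBIN_DIR=/opt/flashinfer/cubins
```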
Install
- pip install flashinfer-python flashinfer-cubin
- pip install flashinfer-cubin
Imports
- flashinfer-cubin
N/A (the package is consumed by `flashinfer-python` at runtime; it is not imported directly)
Quickstart
import torch
import flashinfer

# Single-request decode attention: flashinfer-cubin supplies the pre-compiled
# kernels, so no JIT compilation or cubin download happens on first call.
kv_len = 2048
num_qo_heads = 32
num_kv_heads = 32
head_dim = 128

# Query for the single new token, plus the cached K/V for prior tokens
q = torch.randn(num_qo_heads, head_dim, dtype=torch.float16, device="cuda")
k = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device="cuda")
v = torch.randn(kv_len, num_kv_heads, head_dim, dtype=torch.float16, device="cuda")

# Decode attention over the whole KV cache (causal by construction: the
# single query attends to all kv_len cached positions)
output = flashinfer.single_decode_with_kv_cache(q, k, v)
print(output.shape)  # torch.Size([32, 128])
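Given the architecture window noted in the warnings (SM 7.5 through SM 12.1), a pre-flight check before calling into FlashInfer can be sketched as follows; `is_supported_sm` is a hypothetical helper, not part of the FlashInfer API:

```python
# Hypothetical helper: check whether a device's compute capability falls inside
# the support window stated above (SM 7.5 'Turing' through SM 12.1 'Blackwell').
def is_supported_sm(major: int, minor: int) -> bool:
    cc = major * 10 + minor
    return 75 <= cc <= 121

# On a machine with a CUDA build of PyTorch, the live device can be checked:
#   import torch
#   major, minor = torch.cuda.get_device_capability(0)
#   if not is_supported_sm(major, minor):
#       raise RuntimeError(f"SM {major}.{minor} is outside the supported range")
print(is_supported_sm(8, 0))  # Ampere (SM 8.0) -> True
```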