Pre-compiled cubins for FlashInfer
FlashInfer-cubin provides pre-compiled kernel binaries (cubins) for FlashInfer, covering a wide range of GPU architectures. It is an optional companion package for `flashinfer-python` that removes the need to JIT-compile or download kernels at runtime, which speeds up initialization and enables offline use. The FlashInfer project focuses on high-performance GPU kernels for LLM serving and inference, and it is actively developed, with frequent nightly builds and regular patch releases.
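Because the package is optional, it can help to confirm whether it is present in an environment before relying on offline startup. The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `cubin_status` are hypothetical, and the distribution names are assumed to match the package names mentioned above. It only reports what is installed and does not verify that FlashInfer will actually pick up the pre-compiled kernels.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def cubin_status():
    """Report which of the two distributions are installed (hypothetical helper)."""
    core = installed_version("flashinfer-python")
    cubin = installed_version("flashinfer-cubin")
    return {
        "flashinfer-python": core,
        "flashinfer-cubin": cubin,
        # True only means the optional package is installed, nothing more.
        "precompiled_available": cubin is not None,
    }


if __name__ == "__main__":
    for key, value in cubin_status().items():
        print(f"{key}: {value}")
```

If `precompiled_available` is false, the environment presumably falls back to compiling or downloading kernels at runtime, as described above.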
Resources
API endpoints
full doc /v1/registry/flashinfer-cubin
compatibility /v1/registry/flashinfer-cubin/compatibility
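
The two paths above are relative, so a client has to join them to the registry's host. A small sketch of that step follows; the base URL is a placeholder (the page does not state the host), and `endpoint_url` is a hypothetical helper, not part of any published client.

```python
from urllib.parse import urljoin

# Placeholder host: the page lists only paths, so the real base URL is an assumption.
BASE_URL = "https://registry.example.invalid"

# Paths copied from the "API endpoints" list.
ENDPOINTS = {
    "full_doc": "/v1/registry/flashinfer-cubin",
    "compatibility": "/v1/registry/flashinfer-cubin/compatibility",
}


def endpoint_url(name, base_url=BASE_URL):
    """Build the absolute URL for one of the listed endpoints."""
    return urljoin(base_url, ENDPOINTS[name])


if __name__ == "__main__":
    for name in ENDPOINTS:
        print(name, endpoint_url(name))
```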