ONNX Runtime GenAI

library 0.13.1 · python
verified May 24, 2026

ONNX Runtime GenAI is a Python library for running generative AI models (large language models and multi-modal models) on-device and in the cloud with ONNX Runtime. It implements the complete generation loop: pre- and post-processing, inference with ONNX Runtime, logits processing, search and sampling, and KV cache management. The library is actively developed: version 0.13.1 was released in April 2026, and releases generally follow a quarterly cadence in line with the broader ONNX Runtime project.
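To make the stages of that loop concrete, here is a minimal pure-Python sketch of how they fit together. It is a toy illustration only and does not use or reproduce the library's API: the vocabulary, `toy_model`, `process_logits`, `greedy`, and `generate` are all hypothetical names invented for this example, and the "model" is a deterministic stand-in for a real ONNX Runtime inference call.

```python
# Toy sketch of a generation loop. All names here are hypothetical and
# exist only for illustration; this is NOT the onnxruntime-genai API.

VOCAB = ["<eos>", "hello", "world", "!"]
EOS_ID = 0


def encode(text):
    """Pre-processing: map whitespace-separated words to token ids."""
    return [VOCAB.index(word) for word in text.split()]


def decode(token_ids):
    """Post-processing: map token ids back to text."""
    return " ".join(VOCAB[i] for i in token_ids)


def toy_model(token_id, kv_cache):
    """Stand-in for inference: consume one token, return next-token logits.

    The cache grows by one entry per processed token, mimicking how a KV
    cache avoids recomputing earlier positions. This toy model simply
    favors the next vocabulary entry, wrapping around to <eos>.
    """
    kv_cache.append(token_id)
    favored = (token_id + 1) % len(VOCAB)
    return [5.0 if i == favored else 0.0 for i in range(len(VOCAB))]


def process_logits(logits, temperature=1.0, banned=()):
    """Logits processing: temperature scaling plus optional token bans."""
    scaled = [value / temperature for value in logits]
    for token_id in banned:
        scaled[token_id] = float("-inf")
    return scaled


def greedy(logits):
    """Search: pick the highest-scoring token (no sampling)."""
    return max(range(len(logits)), key=logits.__getitem__)


def generate(prompt, max_new_tokens=8):
    """Run the full loop and return (generated_text, cache_length)."""
    prompt_ids = encode(prompt)
    if not prompt_ids:
        raise ValueError("prompt must contain at least one known token")

    kv_cache = []
    logits = None
    # Prefill: feed the prompt so the cache holds every prompt position.
    for token_id in prompt_ids:
        logits = toy_model(token_id, kv_cache)

    new_tokens = []
    for _ in range(max_new_tokens):
        next_id = greedy(process_logits(logits))
        if next_id == EOS_ID:
            break
        new_tokens.append(next_id)
        # Decode step: only the new token is fed; the cache covers the rest.
        logits = toy_model(next_id, kv_cache)

    return decode(new_tokens), len(kv_cache)


if __name__ == "__main__":
    print(generate("hello"))
```

In a real deployment the library performs each of these steps internally against an exported ONNX model; the sketch only shows why the prompt is processed once up front and why each later step needs just the newest token.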
