ONNX Runtime GenAI

0.13.1 · active · verified Thu Apr 16

ONNX Runtime GenAI is a Python library that provides an easy, flexible, and performant way to run generative AI models (large language models and multi-modal models) on-device and in the cloud using ONNX Runtime. It encapsulates the complete generative AI loop: pre- and post-processing, inference with ONNX Runtime, logits processing, search and sampling, and KV cache management. The library is actively developed, with version 0.13.1 released in April 2026, and generally follows a quarterly release cadence in line with the broader ONNX Runtime project.

Common errors

Warnings

Install
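The core package installs from PyPI. Hardware-specific builds ship as separate packages; the variant names below match the project's published wheels, but check the release notes for your platform before relying on them:

```shell
# CPU (default) build
pip install onnxruntime-genai

# Alternative hardware-accelerated builds -- install ONE of these instead:
# pip install onnxruntime-genai-cuda        # NVIDIA GPUs (CUDA)
# pip install onnxruntime-genai-directml    # Windows GPUs (DirectML)
```

Only one variant should be installed in a given environment, since they all provide the same `onnxruntime_genai` module.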

Imports

Quickstart

This quickstart demonstrates how to load a pre-optimized ONNX model (such as Phi-3 Mini), tokenize an input prompt, and generate text with the `onnxruntime-genai` library. Before running the Python code, you must download an ONNX model into a local directory, typically with `huggingface-cli`. The example reads the model path from an environment variable for flexibility.

import os
import onnxruntime_genai as og

# --- Prerequisite: Download a model ---
# The following shell command downloads the Phi-3 Mini 4K Instruct ONNX model (CPU-INT4 quantized).
# You will need to install huggingface_hub: pip install huggingface_hub
# Run this command in your terminal before executing the Python code:
# huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx \
#   --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* \
#   --local-dir ./phi-3-mini-onnx

model_path = os.environ.get('ONNX_MODEL_PATH', './phi-3-mini-onnx')

try:
    # 1. Load the model
    model = og.Model(model_path)
    print(f"Loaded {model.type} on {model.device_type}")

    # 2. Create a tokenizer
    tokenizer = og.Tokenizer(model)

    # 3. Create generator parameters
    params = og.GeneratorParams(model)
    # do_sample=True is required for top_p/temperature to take effect;
    # without it, generation falls back to greedy search
    params.set_search_options(max_length=200, do_sample=True, top_p=0.9, temperature=0.7)

    # 4. Encode initial prompt and append to generator
    prompt = "The capital of France is"
    input_tokens = tokenizer.encode(prompt)

    # 5. Create a generator instance
    generator = og.Generator(model, params)
    generator.append_tokens(input_tokens)

    print(f"Prompt: {prompt}")
    print("Generated text:", end="")

    # 6. Generate tokens one by one and stream the output. A TokenizerStream
    #    buffers partial tokens, avoiding the broken characters that a
    #    per-token tokenizer.decode() can produce for multi-byte sequences.
    tokenizer_stream = tokenizer.create_stream()
    while not generator.is_done():
        generator.generate_next_token()
        new_token = generator.get_next_tokens()[0]
        print(tokenizer_stream.decode(new_token), end="", flush=True)
    print()

    # Get the full decoded sequence (optional, for non-streaming output)
    # output = tokenizer.decode(generator.get_sequence(0))
    # print(f"\nFull output: {output}")

except Exception as e:
    print(f"An error occurred: {e}")
    print(f"Please ensure the model is downloaded to '{model_path}' and all dependencies are installed.")
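The bare completion prompt above works, but instruct-tuned models such as Phi-3 Mini respond far better when the prompt is wrapped in the model's chat template. A minimal sketch, using the Phi-3 template from the model card (other models use different templates, so check the model's documentation; `phi3_prompt` is an illustrative helper, not part of the library):

```python
# Wrap a user message in Phi-3 Mini's instruct chat template.
# Template taken from the Phi-3 model card; other models differ.
def phi3_prompt(user_message: str) -> str:
    return f"<|user|>\n{user_message} <|end|>\n<|assistant|>"

prompt = phi3_prompt("What is the capital of France?")
# input_tokens = tokenizer.encode(prompt)  # then proceed as in the quickstart
```

Recent releases also expose a chat-template helper on the tokenizer itself; consult the API reference for your installed version before depending on it.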
