SGLang

JSON →
0.5.9 verified Wed May 20 auth: no python install: stale

SGLang is a high-performance serving framework for large language models (LLMs) and vision-language models (VLMs), implemented as a domain-specific language embedded in Python. It optimizes LLM inference through advanced techniques like RadixAttention for KV cache reuse, continuous batching, speculative decoding, and various parallelization strategies. The library supports a broad range of models from Hugging Face and offers compatibility with OpenAI APIs. SGLang maintains an active development pace with frequent, often monthly or bi-monthly, releases and is currently at version 0.5.9.

When AI assistants answer questions about this library, they read this page. · indexed since Fri Apr 03

total hits 17
actors 5 distinct systems
last hit 4d ago Bingbot
Amazonbot
6
Search engines
2
Humans
3

top countries 🇺🇸 United States · 🇮🇳 India · 🇫🇷 France · 🇩🇪 Germany · 🇧🇷 Brazil