vLLM
vLLM is a high-throughput, memory-efficient inference and serving engine for large language models (LLMs). It uses optimization techniques such as PagedAttention to significantly improve LLM serving performance. Currently at version 0.19.0, vLLM maintains a rapid release cadence with frequent updates and new features.
Warnings
- breaking Starting from v0.14.0, asynchronous scheduling is enabled by default. This changes the execution flow and might affect existing scripts. Some configurations, such as pipeline parallelism, the CPU backend, and certain speculative decoding methods, are not yet supported with async scheduling.
- breaking vLLM v0.14.0 introduced a hard requirement for PyTorch 2.9.1 and the default wheels are compiled against CUDA 12.9. Using older PyTorch versions or incompatible CUDA versions can lead to installation failures or runtime errors.
- gotcha Users on CUDA 12.9+ may encounter `CUBLAS_STATUS_INVALID_VALUE` errors. This is often caused by a CUDA library mismatch with the installed PyTorch.
- gotcha Support for new model architectures, like Gemma 4 in v0.19.0, often requires a minimum `transformers` library version. Missing this requirement can lead to model loading failures.
- gotcha Serving Qwen3.5 models with FP8 KV cache on B200 GPUs in v0.18.0 was noted to have degraded accuracy.
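Given the hard PyTorch requirement above, a quick pre-flight check can catch version mismatches before vLLM fails at import or runtime. This is a minimal sketch using only the standard library; the required version string (2.9.1) is taken from the warning above and should be adjusted for the vLLM release you install.

```python
import importlib.metadata


def torch_version_matches(required: str = "2.9.1") -> bool:
    """Check whether the installed torch wheel matches the version vLLM expects.

    Returns False when torch is not installed at all.
    """
    try:
        installed = importlib.metadata.version("torch")
    except importlib.metadata.PackageNotFoundError:
        return False
    # Compare only the release segment so local build tags (e.g. "+cu129") still match.
    return installed.split("+")[0] == required


if __name__ == "__main__":
    print(torch_version_matches())
```

Running this before upgrading vLLM makes CUDA/PyTorch mismatch errors (like the `CUBLAS_STATUS_INVALID_VALUE` gotcha above) easier to diagnose.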
Install
- pip install vllm
- pip install vllm --extra-index-url https://download.pytorch.org/whl/cu121
- uv pip install vllm --torch-backend=auto
Imports
- LLM
from vllm import LLM
- SamplingParams
from vllm import SamplingParams
Quickstart
import os
from vllm import LLM, SamplingParams
# For demonstration, use a small model. Replace with your desired model, e.g., 'mistralai/Mistral-7B-Instruct-v0.2'
# If the model is not found locally, vLLM will attempt to download it from Hugging Face.
# Ensure you have sufficient GPU memory for the chosen model.
model_name = os.environ.get("VLLM_MODEL", "facebook/opt-125m")
# Initialize the LLM engine
llm = LLM(model=model_name)
# Prepare prompts and sampling parameters
prompts = [
    "Hello, my name is",
    "The capital of France is",
    "Write a short poem about a cat.",
]
sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=100)
# Generate text
outputs = llm.generate(prompts, sampling_params)
# Print the results
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
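Beyond offline generation, the same model can be served over vLLM's OpenAI-compatible HTTP API. A minimal sketch (the model name and port here are illustrative; any model usable with `LLM(...)` above should work):

```shell
# Start an OpenAI-compatible server (downloads the model if not cached locally).
vllm serve facebook/opt-125m --port 8000

# In another shell, query the completions endpoint:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "facebook/opt-125m", "prompt": "Hello, my name is", "max_tokens": 32}'
```

Because the API mirrors OpenAI's, existing OpenAI client libraries can be pointed at the local server by changing the base URL.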