{"id":28441,"library":"vllm-omni","title":"vLLM-Omni","description":"vLLM-Omni is a framework for efficient inference with omni-modality models, built on top of vLLM. It supports speech, image, video, audio, and multimodal generation, and tracks upstream vLLM releases. The current version is 0.20.0, under active development with a monthly release cadence.","status":"active","version":"0.20.0","language":"python","source_language":"en","source_url":"https://github.com/vllm-project/vllm-omni","tags":["vllm","multimodal","inference","omni","speech","image","video","audio"],"install":[{"cmd":"pip install vllm-omni","lang":"bash","label":"Standard install"}],"dependencies":[{"reason":"Core inference engine; vllm-omni depends on it and must match its version exactly","package":"vllm","optional":false}],"imports":[{"note":"LLM is re-exported from vLLM, not from vllm-omni directly","wrong":"from vllm_omni import LLM","symbol":"LLM","correct":"from vllm import LLM"},{"note":"Same as LLM, import from vLLM core","wrong":"from vllm_omni import SamplingParams","symbol":"SamplingParams","correct":"from vllm import SamplingParams"},{"note":"From vLLM, not exposed in vllm-omni","wrong":"","symbol":"AsyncLLMEngine","correct":"from vllm.engine.async_llm_engine import AsyncLLMEngine"}],"quickstart":{"code":"from vllm import LLM, SamplingParams\n\n# Load a multimodal model (e.g., Qwen2-VL)\nllm = LLM(model=\"Qwen/Qwen2-VL-7B-Instruct\", trust_remote_code=True)\n\n# OpenAI-style chat messages (with image content) go through llm.chat,\n# not llm.generate, which expects plain text prompts.\nmessage = {\n    \"role\": \"user\",\n    \"content\": [\n        {\"type\": \"image_url\", \"image_url\": {\"url\": \"https://example.com/image.jpg\"}},\n        {\"type\": \"text\", \"text\": \"Describe this image.\"}\n    ]\n}\nsampling_params = SamplingParams(temperature=0.7, max_tokens=512)\noutputs = llm.chat([message], sampling_params)\nprint(outputs[0].outputs[0].text)","lang":"python","description":"Basic multimodal inference via vLLM's LLM.chat interface. Ensure vllm is installed (pip install vllm)."},"warnings":[{"fix":"Install the exact matching vLLM version, typically vllm==<same_version>. E.g., pip install vllm==0.20.0 vllm-omni==0.20.0","message":"vLLM-Omni versions must exactly match the upstream vLLM version they are built against. Mismatched versions (e.g., vllm-omni 0.20.0 with vllm 0.19.0) cause import errors or silent failures.","severity":"breaking","affected_versions":"all"},{"fix":"Import from vllm (e.g., from vllm import LLM) instead of from vllm_omni.","message":"Do not import core classes from vllm_omni directly. All core classes (LLM, SamplingParams, etc.) are re-exported from vllm; importing them from vllm_omni raises ImportError.","severity":"gotcha","affected_versions":"all"},{"fix":"Use the vllm serve CLI or the new async engine API.","message":"The old entrypoint vllm.entrypoints.openai.api_server is deprecated. Use the vllm serve command line, or vllm.entrypoints.openai.run_batch for batch inference.","severity":"deprecated","affected_versions":">=0.18.0"}],"env_vars":null,"last_verified":"2026-05-09T00:00:00.000Z","next_check":"2026-08-07T00:00:00.000Z","problems":[{"fix":"Install with pip install vllm-omni and import core classes from vllm (e.g., from vllm import LLM).","cause":"The PyPI package name is vllm-omni (with a hyphen), but core classes are imported from the vllm module, not vllm_omni.","error":"ModuleNotFoundError: No module named 'vllm_omni'"},{"fix":"Use from vllm import LLM (install vllm if needed).","cause":"vllm_omni does not export LLM; it is re-exported from vllm.","error":"ImportError: cannot import name 'LLM' from 'vllm_omni'"},{"fix":"Check the supported-model list in the docs, or pass trust_remote_code=True if the model requires it.","cause":"Model type not yet implemented, or the model requires trust_remote_code=True.","error":"ValueError: The model is not supported by vLLM-Omni"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}