SGLang Router

0.3.2 · active · verified Wed Apr 15

SGLang Router (PyPI: `sglang-router`, current version 0.3.2) is a high-performance, Rust-based load balancer for SGLang instances. It distributes requests across data-parallel workers and supports several load-balancing policies: cache-aware, power of two, random, and round robin. It also has specialized support for prefill-decode disaggregated serving architectures. The project is evolving into the 'SGLang Model Gateway', which aims to be a full OpenAI API server with features such as native tool calling and session management. It is under active development, with frequent updates and bug fixes.
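To make the policy names above concrete, here is a minimal conceptual sketch of round robin, random, and power-of-two selection in plain Python. It is not the router's Rust implementation: the function names, the worker list, and the `load` dictionary of in-flight request counts are all hypothetical, and the cache-aware policy is omitted because it depends on router internals not described here.

```python
import itertools
import random
from typing import Callable, Dict, List


def make_round_robin(workers: List[str]) -> Callable[[], str]:
    """Return a picker that cycles through the workers in order."""
    cycle = itertools.cycle(workers)
    return lambda: next(cycle)


def pick_random(workers: List[str], rng: random.Random) -> str:
    """Pick one worker uniformly at random."""
    return rng.choice(workers)


def pick_power_of_two(
    workers: List[str], load: Dict[str, int], rng: random.Random
) -> str:
    """Sample two distinct workers and keep the less loaded one."""
    first, second = rng.sample(workers, 2)
    return first if load[first] <= load[second] else second


# Hypothetical worker URLs and in-flight request counts, for illustration only.
workers = ["http://localhost:8000", "http://localhost:8001"]
load = {"http://localhost:8000": 7, "http://localhost:8001": 2}

next_worker = make_round_robin(workers)
rr_picks = [next_worker() for _ in range(3)]
p2c_pick = pick_power_of_two(workers, load, random.Random(0))
```

Power of two needs only two load lookups per request, yet it avoids sending traffic to the busiest worker far more often than a purely random choice does.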

Warnings

Install

Imports

Quickstart

This quickstart shows how to construct an SGLang Router programmatically from a list of SGLang worker URLs; the router load-balances requests across those workers. In an actual deployment that handles incoming requests, the router is usually launched as a separate process with `python -m sglang_router.launch_router`, or integrated into an ASGI application. The SGLang worker instances must be running and reachable at the given URLs for the router to work.

import os

from sglang_router import Router

# NOTE: This quickstart assumes SGLang worker instances are already running
# at these URLs (e.g., http://localhost:8000). Override the defaults through
# the SGLANG_WORKER_URL_1 and SGLANG_WORKER_URL_2 environment variables.
worker_urls = [
    os.environ.get("SGLANG_WORKER_URL_1", "http://localhost:8000"),
    os.environ.get("SGLANG_WORKER_URL_2", "http://localhost:8001"),
]

try:
    # Create the router. By default, it runs in regular HTTP routing mode.
    router = Router(worker_urls=worker_urls)
    print(f"SGLang Router initialized with workers: {worker_urls}")

    # This only demonstrates initialization; nothing is serving traffic yet.
    # In a real application the router is typically started in a separate
    # process, and requests are then sent to it.
    print(
        "Router instance created. To run, typically use "
        "'python -m sglang_router.launch_router'\n"
        "or integrate into an ASGI app. Refer to SGLang documentation for full deployment."
    )
except Exception as e:
    print(f"Error initializing SGLang Router: {e}")
    print("Ensure SGLang worker instances are running and accessible at the provided URLs.")
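Since the quickstart only constructs the router, the following sketch shows one way to assemble the separate-process launch command mentioned above. It assumes the launcher accepts a `--worker-urls` flag followed by one or more URLs, as the SGLang router documentation describes; confirm the flags for your installed version with `python -m sglang_router.launch_router --help`. The helper name `build_launch_command` is hypothetical.

```python
import shlex
import sys
from typing import List


def build_launch_command(worker_urls: List[str]) -> List[str]:
    """Build the argv for launching the router as a separate process.

    Assumes the launcher takes `--worker-urls URL [URL ...]`.
    """
    if not worker_urls:
        raise ValueError("at least one worker URL is required")
    return [
        sys.executable,
        "-m",
        "sglang_router.launch_router",
        "--worker-urls",
        *worker_urls,
    ]


cmd = build_launch_command(["http://localhost:8000", "http://localhost:8001"])

# Show the shell-quoted command so it can be inspected or copied.
print(shlex.join(cmd))

# To actually start the router (this call blocks until the router exits):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if a worker URL contains special characters.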
