SGLang QuickStart: Install, Configure, and Serve LLMs via an OpenAI-Compatible API
Serve open models fast with SGLang.
SGLang is a high-performance serving framework for large language models and multimodal models, built to deliver low-latency, high-throughput inference on anything from a single GPU to a distributed cluster.