> ## Documentation Index
> Fetch the complete documentation index at: https://docs.sglang.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Welcome to SGLang

> High-performance serving framework for large language and multimodal models.

<a class="github-button" href="https://github.com/sgl-project/sglang" data-size="large" data-show-count="true" aria-label="Star sgl-project/sglang on GitHub">
  Star
</a>

<a class="github-button" href="https://github.com/sgl-project/sglang/fork" data-icon="octicon-repo-forked" data-size="large" data-show-count="true" aria-label="Fork sgl-project/sglang on GitHub">
  Fork
</a>

<script async defer src="https://buttons.github.io/buttons.js" />

<br />

<CardGroup cols={2}>
  <Card title="Performance & Runtime" icon="arrow-trend-up">
    Designed for low-latency, high-throughput inference with RadixAttention, prefix caching, and multi-GPU parallelism.
  </Card>

  <Card title="Models & Ecosystem" icon="hexagon-nodes">
    Broad support for Llama, Qwen, DeepSeek, and more. Compatible with Hugging
    Face and OpenAI APIs.
  </Card>

  <Card title="Extensive Hardware Support" icon="microchip">
    Native support across <a href="./docs/hardware-platforms/overview">Hardware Platforms</a>
    including NVIDIA, AMD, Intel Xeon, Google TPU, and Ascend NPU accelerators.
  </Card>

  <Card title="Community & Training" icon="users">
    Open-source with widespread adoption, powering 400k+ GPUs and integrated with major RL frameworks.
  </Card>
</CardGroup>

SGLang powers large-scale production deployments, generating trillions of tokens each day across more than 400,000 GPUs worldwide. It is hosted under the non-profit open-source organization [LMSYS](https://lmsys.org/about/).

***

## Get Started

SGLang is an inference framework meant for production level serving.
It is designed to deliver low-latency and high-throughput inference across a wide range of setups, from a single GPU to large distributed clusters.

<CardGroup cols={2}>
  <Card title="Install SGLang" icon="angles-down" href="./docs/get-started/install">
    Install SGLang with pip, from source, or via Docker on your preferred hardware platform.
  </Card>

  <Card title="Quickstart" icon="zap" href="./docs/get-started/quickstart">
    Launch your first model server and send requests in minutes with OpenAI-compatible APIs.
  </Card>
</CardGroup>

## News and latest blogs

<div className="not-prose">
  <div
    style={{
  display: "grid",
  gridTemplateColumns: "repeat(auto-fit, minmax(300px, 1fr))",
  gap: "1rem",
  alignItems: "stretch",
}}
  >
    <a
      href="https://lmsys.org/blog/2026-06-17-ling-2-6-tpu/"
      target="_blank"
      rel="noopener noreferrer"
      style={{
    display: "block",
    border: "1px solid rgba(128, 128, 128, 0.3)",
    borderRadius: "0.75rem",
    overflow: "hidden",
    textDecoration: "none",
    color: "inherit",
    height: "100%",
  }}
    >
      <div
        style={{
      aspectRatio: "16 / 9",
      overflow: "hidden",
      background: "rgba(128, 128, 128, 0.15)",
    }}
      >
        <img
          src="https://lmsys.org/images/blog/2026-06-17-ling-2-6-tpu/hero.png"
          alt="Optimizing Ling-2.6-1T on TPU with SGLang-JAX: Hiding MoE Data Movement Behind Compute with One Pallas Kernel"
          style={{
        width: "100%",
        height: "100%",
        objectFit: "cover",
        objectPosition: "center",
        display: "block",
      }}
        />
      </div>

      <div style={{ padding: "0.9rem 1rem 1rem" }}>
        <p
          style={{
        margin: 0,
        fontWeight: 600,
        lineHeight: 1.35,
        fontSize: "0.98rem",
      }}
        >
          {"Optimizing Ling-2.6-1T on TPU with SGLang-JAX: Hiding MoE Data Movement Behind Compute with One Pallas Kernel"}
        </p>

        <p
          style={{
        margin: "0.55rem 0 0",
        fontSize: "0.85rem",
        opacity: 0.75,
      }}
        >
          {"June 17, 2026"}
        </p>
      </div>
    </a>

    <a
      href="https://lmsys.org/blog/2026-06-15-next-generation-speculative-decoding-dflash-v2/"
      target="_blank"
      rel="noopener noreferrer"
      style={{
    display: "block",
    border: "1px solid rgba(128, 128, 128, 0.3)",
    borderRadius: "0.75rem",
    overflow: "hidden",
    textDecoration: "none",
    color: "inherit",
    height: "100%",
  }}
    >
      <div
        style={{
      aspectRatio: "16 / 9",
      overflow: "hidden",
      background: "rgba(128, 128, 128, 0.15)",
    }}
      >
        <img
          src="https://lmsys.org/images/blog/dflash-v2/dflash-arch-diagram.webp"
          alt="The next generation of speculative decoding: DFlash and Spec V2"
          style={{
        width: "100%",
        height: "100%",
        objectFit: "cover",
        objectPosition: "center",
        display: "block",
      }}
        />
      </div>

      <div style={{ padding: "0.9rem 1rem 1rem" }}>
        <p
          style={{
        margin: 0,
        fontWeight: 600,
        lineHeight: 1.35,
        fontSize: "0.98rem",
      }}
        >
          {"The next generation of speculative decoding: DFlash and Spec V2"}
        </p>

        <p
          style={{
        margin: "0.55rem 0 0",
        fontSize: "0.85rem",
        opacity: 0.75,
      }}
        >
          {"June 15, 2026"}
        </p>
      </div>
    </a>

    <a
      href="https://lmsys.org/blog/2026-06-08-lmsys-phd-fellowship/"
      target="_blank"
      rel="noopener noreferrer"
      style={{
    display: "block",
    border: "1px solid rgba(128, 128, 128, 0.3)",
    borderRadius: "0.75rem",
    overflow: "hidden",
    textDecoration: "none",
    color: "inherit",
    height: "100%",
  }}
    >
      <div
        style={{
      aspectRatio: "16 / 9",
      overflow: "hidden",
      background: "rgba(128, 128, 128, 0.15)",
    }}
      >
        <img
          src="https://lmsys.org/images/blog/fellowship_apply/fellowship_program.png"
          alt="Announcing the Recipient of the 2026 LMSYS PhD Fellowship"
          style={{
        width: "100%",
        height: "100%",
        objectFit: "cover",
        objectPosition: "center",
        display: "block",
      }}
        />
      </div>

      <div style={{ padding: "0.9rem 1rem 1rem" }}>
        <p
          style={{
        margin: 0,
        fontWeight: 600,
        lineHeight: 1.35,
        fontSize: "0.98rem",
      }}
        >
          {"Announcing the Recipient of the 2026 LMSYS PhD Fellowship"}
        </p>

        <p
          style={{
        margin: "0.55rem 0 0",
        fontSize: "0.85rem",
        opacity: 0.75,
      }}
        >
          {"June 08, 2026"}
        </p>
      </div>
    </a>

    <a
      href="https://lmsys.org/blog/2026-06-04-nvidia-run-nemotron-3-ultra/"
      target="_blank"
      rel="noopener noreferrer"
      style={{
    display: "block",
    border: "1px solid rgba(128, 128, 128, 0.3)",
    borderRadius: "0.75rem",
    overflow: "hidden",
    textDecoration: "none",
    color: "inherit",
    height: "100%",
  }}
    >
      <div
        style={{
      aspectRatio: "16 / 9",
      overflow: "hidden",
      background: "rgba(128, 128, 128, 0.15)",
    }}
      >
        <img
          src="https://lmsys.org/images/blog/nemotron-3-ultra/image1.png"
          alt="SGLang and Miles Add Day-0 Support for NVIDIA Nemotron 3 Ultra for Long-Running Autonomous Agents"
          style={{
        width: "100%",
        height: "100%",
        objectFit: "cover",
        objectPosition: "center",
        display: "block",
      }}
        />
      </div>

      <div style={{ padding: "0.9rem 1rem 1rem" }}>
        <p
          style={{
        margin: 0,
        fontWeight: 600,
        lineHeight: 1.35,
        fontSize: "0.98rem",
      }}
        >
          {"SGLang and Miles Add Day-0 Support for NVIDIA Nemotron 3 Ultra for Long-Running Autonomous Agents"}
        </p>

        <p
          style={{
        margin: "0.55rem 0 0",
        fontSize: "0.85rem",
        opacity: 0.75,
      }}
        >
          {"June 4, 2026"}
        </p>
      </div>
    </a>

    <a
      href="https://lmsys.org/blog/2026-06-04-higgs-audio-v3-tts/"
      target="_blank"
      rel="noopener noreferrer"
      style={{
    display: "block",
    border: "1px solid rgba(128, 128, 128, 0.3)",
    borderRadius: "0.75rem",
    overflow: "hidden",
    textDecoration: "none",
    color: "inherit",
    height: "100%",
  }}
    >
      <div
        style={{
      aspectRatio: "16 / 9",
      overflow: "hidden",
      background: "rgba(128, 128, 128, 0.15)",
    }}
      >
        <img
          src="https://sgl-project.github.io/sglang-omni/_images/higgs-architecture.png"
          alt="Higgs Audio v3 TTS on SGLang-Omni: Real-Time, Controllable Speech for Voice Agents"
          style={{
        width: "100%",
        height: "100%",
        objectFit: "cover",
        objectPosition: "center",
        display: "block",
      }}
        />
      </div>

      <div style={{ padding: "0.9rem 1rem 1rem" }}>
        <p
          style={{
        margin: 0,
        fontWeight: 600,
        lineHeight: 1.35,
        fontSize: "0.98rem",
      }}
        >
          {"Higgs Audio v3 TTS on SGLang-Omni: Real-Time, Controllable Speech for Voice Agents"}
        </p>

        <p
          style={{
        margin: "0.55rem 0 0",
        fontSize: "0.85rem",
        opacity: 0.75,
      }}
        >
          {"June 4, 2026"}
        </p>
      </div>
    </a>

    <a
      href="https://lmsys.org/blog/2026-06-01-hetero-epd/"
      target="_blank"
      rel="noopener noreferrer"
      style={{
    display: "block",
    border: "1px solid rgba(128, 128, 128, 0.3)",
    borderRadius: "0.75rem",
    overflow: "hidden",
    textDecoration: "none",
    color: "inherit",
    height: "100%",
  }}
    >
      <div
        style={{
      aspectRatio: "16 / 9",
      overflow: "hidden",
      background: "rgba(128, 128, 128, 0.15)",
    }}
      >
        <img
          src="https://lmsys.org/images/blog/hetero-epd/1.png"
          alt="Heterogeneous CPU + GPU EPD Disaggregation to Boost VLM Serving"
          style={{
        width: "100%",
        height: "100%",
        objectFit: "cover",
        objectPosition: "center",
        display: "block",
      }}
        />
      </div>

      <div style={{ padding: "0.9rem 1rem 1rem" }}>
        <p
          style={{
        margin: 0,
        fontWeight: 600,
        lineHeight: 1.35,
        fontSize: "0.98rem",
      }}
        >
          {"Heterogeneous CPU + GPU EPD Disaggregation to Boost VLM Serving"}
        </p>

        <p
          style={{
        margin: "0.55rem 0 0",
        fontSize: "0.85rem",
        opacity: 0.75,
      }}
        >
          {"May 29, 2026"}
        </p>
      </div>
    </a>
  </div>
</div>

***

## Learn more and join the community

<div className="not-prose">
  <div
    style={{
  padding: "0.9rem 0",
  borderTop: "1px solid rgba(128, 128, 128, 0.24)",
  borderBottom: "1px solid rgba(128, 128, 128, 0.24)",
}}
  >
    <p
      style={{
    margin: "0 0 0.35rem",
    fontSize: "0.82rem",
    fontWeight: 700,
    letterSpacing: "0.08em",
    textTransform: "uppercase",
    opacity: 0.72,
  }}
    >
      Stay connected
    </p>

    <div
      style={{
    display: "grid",
    gap: "0.55rem",
    fontSize: "0.97rem",
    lineHeight: 1.7,
  }}
    >
      <div>
        <span style={{ display: "inline-flex", alignItems: "center", verticalAlign: "-0.125em" }}>
          <Icon icon="map" size={14} />
        </span>

        {" "}

        <a href="https://roadmap.sglang.io">Development roadmap</a>
        <span style={{ opacity: 0.62 }}> to follow current priorities and upcoming work.</span>
      </div>

      <div>
        <span style={{ display: "inline-flex", alignItems: "center", verticalAlign: "-0.125em" }}>
          <Icon icon="calendar-days" size={14} />
        </span>

        {" "}

        <a href="https://meet.sglang.io">Weekly public development meeting</a>
        <span style={{ opacity: 0.62 }}> to hear updates and join open discussions.</span>
      </div>

      <div>
        <span style={{ display: "inline-flex", alignItems: "center", verticalAlign: "-0.125em" }}>
          <Icon icon="slack" size={14} />
        </span>

        {" "}

        <a href="https://slack.sglang.io/">Slack</a>
        <span style={{ opacity: 0.62 }}> for questions, feedback, and community support.</span>
      </div>

      <div>
        <a href="https://x.com/lmsysorg">X Twitter</a>
        <span style={{ opacity: 0.62 }}> and </span>

        <span style={{ display: "inline-flex", alignItems: "center", verticalAlign: "-0.125em" }}>
          <Icon icon="linkedin" size={14} />
        </span>

        {" "}

        <a href="https://www.linkedin.com/company/sgl-project/">LinkedIn</a>
        <span style={{ opacity: 0.62 }}> for project updates.</span>
      </div>

      <div>
        <span style={{ display: "inline-flex", alignItems: "center", verticalAlign: "-0.125em" }}>
          <Icon icon="newspaper" size={14} />
        </span>

        {" "}

        <a href="https://lmsys.org/blog/">LMSYS blog</a>
        <span style={{ opacity: 0.62 }}> for release notes, benchmarks, and technical deep dives.</span>
      </div>

      <div>
        <span style={{ display: "inline-flex", alignItems: "center", verticalAlign: "-0.125em" }}>
          <Icon icon="book-open" size={14} />
        </span>

        {" "}

        <a href="https://github.com/sgl-project/sgl-learning-materials">Learning materials</a>
        <span style={{ opacity: 0.62 }}> for blogs, slides, and videos.</span>
      </div>
    </div>
  </div>
</div>
