Example Launch Command
SGLang supports different DLLM (diffusion language model) algorithms, such as LowConfidence and JointThreshold.
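As a rough illustration of selecting an algorithm at launch time, the sketch below assembles a server launch command in Python. The `sglang.launch_server` module, `--model-path`, and `--port` are standard SGLang options; the flag names `--dllm-algorithm` and `--dllm-algorithm-config`, the helper `build_launch_command`, and the file name `config.yaml` are assumptions made for this example and should be checked against `python -m sglang.launch_server --help` for your installed version.

```python
import shlex


def build_launch_command(model_path, algorithm="LowConfidence",
                         config_path=None, port=30000):
    """Assemble an argv list for launching an SGLang server with a DLLM algorithm."""
    cmd = [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--port", str(port),
        # Assumed flag name for choosing the DLLM algorithm.
        "--dllm-algorithm", algorithm,
    ]
    if config_path is not None:
        # Assumed flag name for the algorithm's configuration file.
        cmd += ["--dllm-algorithm-config", config_path]
    return cmd


# Print a shell-ready command line; config.yaml is a hypothetical path.
cmd = build_launch_command("inclusionAI/LLaDA2.0-flash", config_path="config.yaml")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed directly to `subprocess.run(cmd)`.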
Example Configuration File
Depending on the algorithm selected, the configuration parameters vary. LowConfidence Config:
Example Client Code Snippet
Just like other supported models, diffusion language models can be used via the REST API or the Python client, for example to send a generation request to the launched server.
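A minimal sketch of a generation request over the REST API, using only the Python standard library. It assumes the server is reachable at `http://127.0.0.1:30000` (SGLang's default port) and that SGLang's native `/generate` endpoint returns a JSON object with a `text` field for a single prompt. The helper names `build_generate_request` and `generate` are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumption: the server was launched locally on SGLang's default port.
BASE_URL = "http://127.0.0.1:30000"


def build_generate_request(prompt, max_new_tokens=128, temperature=0.0):
    """Build the JSON body for SGLang's native /generate endpoint."""
    return {
        "text": prompt,
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def generate(prompt, base_url=BASE_URL, timeout=60.0, **sampling):
    """POST a generation request and return the generated text."""
    body = json.dumps(build_generate_request(prompt, **sampling)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Assumed response shape: {"text": "...", "meta_info": {...}}
        return json.loads(resp.read().decode("utf-8"))["text"]


# Usage (requires a running server):
#   print(generate("Write a haiku about diffusion models.", max_new_tokens=64))
```

Keeping the payload construction separate from the network call makes the request body easy to inspect or log before it is sent.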
Supported Models
Below, the supported models are summarized in a table.

| Model Family | Example Model | Description |
|---|---|---|
| LLaDA2.0 (mini, flash) | inclusionAI/LLaDA2.0-flash | LLaDA2.0-flash is a diffusion language model featuring a 100B Mixture-of-Experts (MoE) architecture. |
| SDAR (JetLM) | JetLM/SDAR-8B-Chat | SDAR series diffusion language model (Chat), dense architecture. |
| SDAR (JetLM) | JetLM/SDAR-30B-A3B-Chat | SDAR series diffusion language model (Chat), MoE architecture. |
