Enable
Dynamic batching is disabled by default with --batching-max-size 1.
Command
Use --batching-config /path/to/batching_config.json to load JSON rules when a model or resolution needs a lower cap than --batching-max-size:
Config
Compatibility
An initial implementation of dynamic batching for T2I and T2V models can be found in #18764. The current compatibility grid is below and will be updated as more coverage is added. See Supported Models and Optimization Compatibility for common model IDs and optimization support.
✅ means supported, ❌ means not currently supported, ? means untested, and - means not applicable.
Image
| Model | T2I | I2I |
|---|---|---|
| FLUX.1-dev | ✅ | - |
| FLUX.2-dev | ✅ | ❌ |
| FLUX.2-dev-NVFP4 | ? | ? |
| FLUX.2-Klein-4B | ✅ | ❌ |
| FLUX.2-Klein-9B | ? | ? |
| FLUX.2-Klein-Base-4B | ? | ? |
| FLUX.2-Klein-Base-9B | ? | ? |
| Z-Image | ? | - |
| Z-Image-Turbo | ✅ | - |
| GLM-Image | ❌ | - |
| Qwen Image | ✅ | - |
| Qwen Image 2512 | ✅ | - |
| Qwen Image Edit | - | ❌ |
| Qwen Image Edit 2509 | - | ? |
| Qwen Image Edit 2511 | - | ? |
| Qwen Image Layered | ? | ? |
| SD3 Medium | ? | - |
| SD3.5 Medium | ? | - |
| SD3.5 Large | ? | - |
| Hunyuan3D-2 | ? | - |
| SANA 1.5 1.6B | ✅ | - |
| SANA 1.5 4.8B | ✅ | - |
| SANA 1600M 1024px | ? | - |
| SANA 600M 1024px | ? | - |
| SANA 1600M 512px | ? | - |
| SANA 600M 512px | ? | - |
| FireRed-Image-Edit 1.0 | - | ? |
| FireRed-Image-Edit 1.1 | - | ? |
| ERNIE-Image | ? | - |
| ERNIE-Image-Turbo | ? | - |
Video
| Model | Support |
|---|---|
| FastWan2.1 T2V 1.3B | ✅ |
| FastWan2.2 TI2V 5B Full Attn | ❌ |
| Wan2.2 TI2V 5B | ❌ |
| Wan2.2 T2V A14B | ✅ |
| Wan2.2 I2V A14B | ❌ |
| HunyuanVideo | ❌ |
| FastHunyuan | ❌ |
| Wan2.1 T2V 1.3B | ✅ |
| Wan2.1 T2V 14B | ✅ |
| Wan2.1 I2V 480P | ? |
| Wan2.1 I2V 720P | ? |
| TurboWan2.1 T2V 1.3B | ✅ |
| TurboWan2.1 T2V 14B | ✅ |
| TurboWan2.1 T2V 14B 720P | ✅ |
| TurboWan2.2 I2V A14B | ? |
| Wan2.1 Fun 1.3B InP | ? |
| Helios Base | ? |
| Helios Mid | ? |
| Helios Distilled | ? |
| LTX-2 | ? |
| LTX-2.3 | ? |
Notes
- Requests batch only when model inputs, sampling parameters, output handling, and any configured rules are compatible.
- There is no startup probing, runtime learning, OOM retry, or automatic fallback to singletons. If a merged batch fails or cannot be split, every request in that batch receives an error.
- Batch shape can change kernels, so singleton and dynamic outputs are not expected to be bit-exact.
- Use --enable-batching-metrics to inspect realized batches.
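The compatibility and failure notes above can be pictured with a minimal Python sketch. This is not the actual implementation: `Request`, `compat_key`, `form_batches`, and `run_batches` are hypothetical names, and the fields used to decide compatibility are assumptions chosen for illustration. The sketch groups requests by a compatibility key, caps each group at a maximum size, and reports an error to every member of a batch that fails.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Request:
    # Hypothetical request shape; the real fields are not specified here.
    request_id: str
    model: str
    width: int
    height: int
    num_steps: int
    prompt: str


def compat_key(req: Request) -> tuple:
    # Assumption: requests are compatible when everything except the prompt matches.
    return (req.model, req.width, req.height, req.num_steps)


def form_batches(requests: list[Request], max_size: int) -> list[list[Request]]:
    """Group compatible requests, then cut each group into chunks of at most max_size."""
    groups: dict[tuple, list[Request]] = defaultdict(list)
    for req in requests:
        groups[compat_key(req)].append(req)
    batches: list[list[Request]] = []
    for group in groups.values():
        for start in range(0, len(group), max_size):
            batches.append(group[start:start + max_size])
    return batches


def run_batches(
    batches: list[list[Request]],
    run_fn: Callable[[list[Request]], list[str]],
) -> dict[str, str]:
    """Run each batch once; on failure, every request in that batch gets an error."""
    results: dict[str, str] = {}
    for batch in batches:
        try:
            outputs = run_fn(batch)
            for req, out in zip(batch, outputs):
                results[req.request_id] = out
        except Exception as exc:
            # No retry and no fallback to singletons, mirroring the note above.
            for req in batch:
                results[req.request_id] = f"error: {exc}"
    return results
```

With `max_size=1` every batch in this sketch is a singleton, which corresponds to batching being disabled. Raising the cap lets compatible requests share a batch, at the cost that one failure affects all of them.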
