Supported Models on Ascend NPU
This section lists the models supported on Ascend NPU: large language models, multimodal language models, embedding models, reward models, and rerank models. The mainstream DeepSeek, Qwen, and GLM series are covered. You are welcome to enable other models based on your business requirements.
Large Language Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| DeepSeek V3/V3.1 | DeepSeek | √ | √ |
| vllm-ascend/DeepSeek-V3.2-Exp-W8A8 | DeepSeek | √ | √ |
| vllm-ascend/DeepSeek-R1-0528-W8A8 | DeepSeek | √ | √ |
| vllm-ascend/DeepSeek-V2-Lite-W8A8 | DeepSeek | √ | √ |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Qwen | √ | √ |
| Qwen/Qwen3-32B | Qwen | √ | √ |
| Qwen/Qwen3-0.6B | Qwen | √ | √ |
| vllm-ascend/Qwen3-235B-A22B-W8A8 | Qwen | √ | √ |
| Qwen/Qwen3-Next-80B-A3B-Instruct | Qwen | √ | √ |
| Qwen3-Coder-480B-A35B-Instruct-w8a8-QuaRot | Qwen | √ | √ |
| Qwen/Qwen2.5-7B-Instruct | Qwen | √ | √ |
| vllm-ascend/QWQ-32B-W8A8 | Qwen | √ | √ |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | Llama | √ | √ |
| AI-ModelScope/Llama-3.1-8B-Instruct | Llama | √ | √ |
| LLM-Research/Llama-3.2-1B-Instruct | Llama | √ | √ |
| mistralai/Mistral-7B-Instruct-v0.2 | Mistral | √ | √ |
| google/gemma-3-4b-it | Gemma | √ | √ |
| microsoft/Phi-4-multimodal-instruct | Phi | √ | √ |
| allenai/OLMoE-1B-7B-0924 | OLMoE | √ | √ |
| stabilityai/stablelm-2-1_6b | StableLM | √ | √ |
| CohereForAI/c4ai-command-r-v01 | Command-R | √ | √ |
| huihui-ai/grok-2 | Grok | √ | √ |
| ZhipuAI/chatglm2-6b | ChatGLM | √ | √ |
| Shanghai_AI_Laboratory/internlm2-7b | InternLM 2 | √ | √ |
| LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct | ExaONE 3 | √ | √ |
| xverse/XVERSE-MoE-A36B | XVERSE | √ | √ |
| HuggingFaceTB/SmolLM-1.7B | SmolLM | √ | √ |
| ZhipuAI/glm-4-9b-chat | GLM-4 | √ | √ |
| XiaomiMiMo/MiMo-7B-RL | MiMo | √ | √ |
| arcee-ai/AFM-4.5B-Base | Arcee AFM-4.5B | √ | √ |
| Howeee/persimmon-8b-chat | Persimmon | √ | √ |
| inclusionAI/Ling-lite | Ling | √ | √ |
| ibm-granite/granite-3.1-8b-instruct | Granite | √ | √ |
| ibm-granite/granite-3.0-3b-a800m-instruct | Granite MoE | √ | √ |
| databricks/dbrx-instruct | DBRX (Databricks) | √ | √ |
| baichuan-inc/Baichuan2-13B-Chat | Baichuan 2 (7B, 13B) | √ | √ |
| baidu/ERNIE-4.5-21B-A3B-PT | ERNIE-4.5 (4.5, 4.5MoE series) | √ | √ |
| openbmb/MiniCPM3-4B | MiniCPM (v3, 4B) | √ | √ |
| openai/gpt-oss-120b | GPTOSS | × | × |
Multimodal Language Models
| Models | Model Family (Variants) | A2 Supported | A3 Supported |
|---|---|---|---|
| Qwen/Qwen2.5-VL-3B-Instruct | Qwen-VL | √ | √ |
| Qwen/Qwen2.5-VL-72B-Instruct | Qwen-VL | √ | √ |
| Qwen/Qwen3-VL-30B-A3B-Instruct | Qwen-VL | √ | √ |
| Qwen/Qwen3-VL-8B-Instruct | Qwen-VL | √ | √ |
| Qwen/Qwen3-VL-4B-Instruct | Qwen-VL | √ | √ |
| Qwen/Qwen3-VL-235B-A22B-Instruct | Qwen-VL | √ | √ |
| deepseek-ai/deepseek-vl2 | DeepSeek-VL2 | √ | √ |
| deepseek-ai/Janus-Pro-7B | Janus-Pro (1B, 7B) | √ | √ |
| openbmb/MiniCPM-V-2_6 | MiniCPM-V / MiniCPM-o | √ | √ |
| google/gemma-3-4b-it | Gemma 3 (Multimodal) | √ | √ |
| mistralai/Mistral-Small-3.1-24B-Instruct-2503 | Mistral-Small-3.1-24B | √ | √ |
| microsoft/Phi-4-multimodal-instruct | Phi-4-multimodal-instruct | √ | √ |
| XiaomiMiMo/MiMo-VL-7B-RL | MiMo-VL (7B) | √ | √ |
| AI-ModelScope/llava-v1.6-34b | LLaVA (v1.5 & v1.6) | √ | √ |
| lmms-lab/llava-next-72b | LLaVA-NeXT (8B, 72B) | √ | √ |
| lmms-lab/llava-onevision-qwen2-7b-ov | LLaVA-OneVision | √ | √ |
| Kimi/Kimi-VL-A3B-Instruct | Kimi-VL (A3B) | √ | √ |
| ZhipuAI/GLM-4.5V | GLM-4.5V (106B) | √ | √ |
| meta-llama/Llama-3.2-11B-Vision-Instruct | Llama 3.2 Vision (11B) | × | × |
Embedding Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| intfloat/e5-mistral-7b-instruct | E5 (Llama/Mistral based) | √ | √ |
| iic/gte_Qwen2-1.5B-instruct | GTE-Qwen2 | √ | √ |
| Qwen/Qwen3-Embedding-8B | Qwen3-Embedding | √ | √ |
| Alibaba-NLP/gme-Qwen2-VL-2B-Instruct | GME (Multimodal) | √ | √ |
| AI-ModelScope/clip-vit-large-patch14-336 | CLIP | √ | √ |
| BAAI/bge-large-en-v1.5 | BGE | × | × |
Reward Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| Skywork/Skywork-Reward-Llama-3.1-8B-v0.2 | Llama3.1 Reward | √ | √ |
| Shanghai_AI_Laboratory/internlm2-7b-reward | InternLM 2 Reward | √ | √ |
| Qwen/Qwen2.5-Math-RM-72B | Qwen2.5 Reward - Math | √ | √ |
| jason9693/Qwen2.5-1.5B-apeach | Qwen2.5 Reward - Sequence | √ | √ |
| Skywork/Skywork-Reward-Gemma-2-27B-v0.2 | Gemma 2-27B Reward | × | × |
Rerank Models
| Models | Model Family | A2 Supported | A3 Supported |
|---|---|---|---|
| BAAI/bge-reranker-v2-m3 | BGE-Reranker | √ | √ |