Supported Models on Ascend NPU

This section lists the models supported on Ascend NPUs, grouped into Large Language Models, Multimodal Language Models, Embedding Models, Reward Models, and Rerank Models. The mainstream DeepSeek, Qwen, and GLM series are covered. You are welcome to enable additional models to suit your business requirements.

Large Language Models

Models	Model Family	A2 Supported	A3 Supported
DeepSeek V3/V3.1	DeepSeek	✅	✅
vllm-ascend/DeepSeek-V3.2-Exp-W8A8	DeepSeek	✅	✅
vllm-ascend/DeepSeek-R1-0528-W8A8	DeepSeek	✅	✅
vllm-ascend/DeepSeek-V2-Lite-W8A8	DeepSeek	✅	✅
Qwen/Qwen3-30B-A3B-Instruct-2507	Qwen	✅	✅
Qwen/Qwen3-32B	Qwen	✅	✅
Qwen/Qwen3-0.6B	Qwen	✅	✅
vllm-ascend/Qwen3-235B-A22B-W8A8	Qwen	✅	✅
Qwen/Qwen3-Next-80B-A3B-Instruct	Qwen	✅	✅
Qwen3-Coder-480B-A35B-Instruct-w8a8-QuaRot	Qwen	✅	✅
Qwen/Qwen2.5-7B-Instruct	Qwen	✅	✅
vllm-ascend/QWQ-32B-W8A8	Qwen	✅	✅
meta-llama/Llama-4-Scout-17B-16E-Instruct	Llama	✅	✅
AI-ModelScope/Llama-3.1-8B-Instruct	Llama	✅	✅
LLM-Research/llama-2-7b	Llama	✅	✅
LLM-Research/Llama-3.2-1B-Instruct	Llama	✅	✅
mistralai/Mistral-7B-Instruct-v0.2	Mistral	✅	✅
google/gemma-3-4b-it	Gemma	✅	✅
microsoft/Phi-4-multimodal-instruct	Phi	✅	✅
allenai/OLMoE-1B-7B-0924	OLMoE	✅	✅
stabilityai/stablelm-2-1_6b	StableLM	✅	✅
CohereForAI/c4ai-command-r-v01	Command-R	✅	✅
huihui-ai/grok-2	Grok	✅	✅
ZhipuAI/chatglm2-6b	ChatGLM	✅	✅
Shanghai_AI_Laboratory/internlm2-7b	InternLM 2	✅	✅
LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct	EXAONE 3	✅	✅
xverse/XVERSE-MoE-A36B	XVERSE	✅	✅
HuggingFaceTB/SmolLM-1.7B	SmolLM	✅	✅
ZhipuAI/glm-4-9b-chat	GLM-4	✅	✅
XiaomiMiMo/MiMo-7B-RL	MiMo	✅	✅
arcee-ai/AFM-4.5B-Base	Arcee AFM-4.5B	✅	✅
Howeee/persimmon-8b-chat	Persimmon	✅	✅
inclusionAI/Ling-lite	Ling	✅	✅
ibm-granite/granite-3.1-8b-instruct	Granite	✅	✅
ibm-granite/granite-3.0-3b-a800m-instruct	Granite MoE	✅	✅
AI-ModelScope/dbrx-instruct	DBRX (Databricks)	✅	✅
baichuan-inc/Baichuan2-13B-Chat	Baichuan 2 (7B, 13B)	✅	✅
baidu/ERNIE-4.5-21B-A3B-PT	ERNIE-4.5 (4.5, 4.5MoE series)	✅	✅
openbmb/MiniCPM3-4B	MiniCPM (v3, 4B)	✅	✅
Kimi/Kimi-K2-Thinking	Kimi	✅	✅
openai/gpt-oss-120b	GPT-OSS	✅	✅

Multimodal Language Models

Models	Model Family (Variants)	A2 Supported	A3 Supported
Qwen/Qwen2.5-VL-3B-Instruct	Qwen-VL	✅	✅
Qwen/Qwen2.5-VL-72B-Instruct	Qwen-VL	✅	✅
Qwen/Qwen3-VL-30B-A3B-Instruct	Qwen-VL	✅	✅
Qwen/Qwen3-VL-8B-Instruct	Qwen-VL	✅	✅
Qwen/Qwen3-VL-4B-Instruct	Qwen-VL	✅	✅
Qwen/Qwen3-VL-235B-A22B-Instruct	Qwen-VL	✅	✅
deepseek-ai/deepseek-vl2	DeepSeek-VL2	✅	✅
deepseek-ai/Janus-Pro-1B	Janus-Pro (1B, 7B)	✅	✅
deepseek-ai/Janus-Pro-7B	Janus-Pro (1B, 7B)	✅	✅
openbmb/MiniCPM-V-2_6	MiniCPM-V / MiniCPM-o	✅	✅
openbmb/MiniCPM-o-2_6	MiniCPM-V / MiniCPM-o	✅	✅
google/gemma-3-4b-it	Gemma 3 (Multimodal)	✅	✅
mistralai/Mistral-Small-3.1-24B-Instruct-2503	Mistral-Small-3.1-24B	✅	✅
microsoft/Phi-4-multimodal-instruct	Phi-4-multimodal-instruct	✅	✅
XiaomiMiMo/MiMo-VL-7B-RL	MiMo-VL (7B)	✅	✅
AI-ModelScope/llava-v1.6-34b	LLaVA (v1.5 & v1.6)	✅	✅
lmms-lab/llava-next-72b	LLaVA-NeXT (8B, 72B)	✅	✅
lmms-lab/llava-onevision-qwen2-7b-ov	LLaVA-OneVision	✅	✅
Kimi/Kimi-VL-A3B-Instruct	Kimi-VL (A3B)	✅	✅
ZhipuAI/GLM-4.5V	GLM-4.5V (106B)	✅	✅
meta-llama/Llama-3.2-11B-Vision-Instruct	Llama 3.2 Vision (11B)	✅	✅

Embedding Models

Models	Model Family	A2 Supported	A3 Supported
intfloat/e5-mistral-7b-instruct	E5 (Llama/Mistral based)	✅	✅
iic/gte_Qwen2-1.5B-instruct	GTE-Qwen2	✅	✅
Qwen/Qwen3-Embedding-8B	Qwen3-Embedding	✅	✅
Alibaba-NLP/gme-Qwen2-VL-2B-Instruct	GME (Multimodal)	✅	✅
AI-ModelScope/clip-vit-large-patch14-336	CLIP	✅	✅
BAAI/bge-large-en-v1.5	BGE	✅	✅

Reward Models

Models	Model Family	A2 Supported	A3 Supported
Skywork/Skywork-Reward-Llama-3.1-8B-v0.2	Llama3.1 Reward	✅	✅
Shanghai_AI_Laboratory/internlm2-7b-reward	InternLM 2 Reward	✅	✅
Qwen/Qwen2.5-Math-RM-72B	Qwen2.5 Reward - Math	✅	✅
Howeee/Qwen2.5-1.5B-apeach	Qwen2.5 Reward - Sequence	✅	✅
Skywork/Skywork-Reward-Gemma-2-27B-v0.2	Gemma 2-27B Reward	✅	✅

Rerank Models

Models	Model Family	A2 Supported	A3 Supported
BAAI/bge-reranker-v2-m3	BGE-Reranker	✅	✅
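
The tables above share one layout: model identifier, model family, then an A2 and an A3 support column. As a minimal sketch, assuming the rows are available as tab-separated text, a support lookup could be written as below. The names `SupportRow`, `parse_support_table`, and `is_supported` are hypothetical helpers introduced for illustration and are not part of any library; the sample text copies two rows from the tables.

```python
from typing import Dict, NamedTuple


class SupportRow(NamedTuple):
    """One table row (hypothetical structure, for illustration only)."""
    model: str
    family: str
    a2: bool
    a3: bool


# Two rows copied from the tables above, in the same tab-separated layout.
SAMPLE_TABLE = (
    "Models\tModel Family\tA2 Supported\tA3 Supported\n"
    "Qwen/Qwen3-32B\tQwen\t✅\t✅\n"
    "BAAI/bge-reranker-v2-m3\tBGE-Reranker\t✅\t✅\n"
)


def parse_support_table(text: str) -> Dict[str, SupportRow]:
    """Parse a tab-separated support table into a dict keyed by model name."""
    lines = [line for line in text.splitlines() if line.strip()]
    rows: Dict[str, SupportRow] = {}
    for line in lines[1:]:  # the first non-empty line is the header
        model, family, a2, a3 = (cell.strip() for cell in line.split("\t"))
        rows[model] = SupportRow(model, family, a2 == "✅", a3 == "✅")
    return rows


def is_supported(rows: Dict[str, SupportRow], model: str, platform: str) -> bool:
    """Return True if `model` is marked as supported on `platform` ("A2" or "A3").

    Models that are not listed are reported as unsupported; an unknown
    platform name raises KeyError.
    """
    row = rows.get(model)
    if row is None:
        return False
    return {"A2": row.a2, "A3": row.a3}[platform]


rows = parse_support_table(SAMPLE_TABLE)
print(is_supported(rows, "Qwen/Qwen3-32B", "A3"))
print(is_supported(rows, "some-org/unlisted-model", "A2"))
```

A model missing from the parsed table is treated as unsupported here, which is a conservative choice for this sketch: absence from the list does not necessarily mean the model cannot be enabled.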
