This section lists the models supported on the Ascend NPU: Large Language Models, Multimodal Language Models, Embedding Models, Reward Models, and Rerank Models. The mainstream DeepSeek, Qwen, and GLM series are included. You are welcome to enable other models to suit your business requirements.

Large Language Models

| Models | Model Family | A2 Supported | A3 Supported |
| --- | --- | --- | --- |
| DeepSeek V3/V3.1 | DeepSeek | | |
| vllm-ascend/DeepSeek-V3.2-Exp-W8A8 | DeepSeek | | |
| vllm-ascend/DeepSeek-R1-0528-W8A8 | DeepSeek | | |
| vllm-ascend/DeepSeek-V2-Lite-W8A8 | DeepSeek | | |
| Qwen/Qwen3-30B-A3B-Instruct-2507 | Qwen | | |
| Qwen/Qwen3-32B | Qwen | | |
| Qwen/Qwen3-0.6B | Qwen | | |
| vllm-ascend/Qwen3-235B-A22B-W8A8 | Qwen | | |
| Qwen/Qwen3-Next-80B-A3B-Instruct | Qwen | | |
| Qwen3-Coder-480B-A35B-Instruct-w8a8-QuaRot | Qwen | | |
| Qwen/Qwen2.5-7B-Instruct | Qwen | | |
| vllm-ascend/QWQ-32B-W8A8 | Qwen | | |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | Llama | | |
| AI-ModelScope/Llama-3.1-8B-Instruct | Llama | | |
| LLM-Research/llama-2-7b | Llama | | |
| LLM-Research/Llama-3.2-1B-Instruct | Llama | | |
| mistralai/Mistral-7B-Instruct-v0.2 | Mistral | | |
| google/gemma-3-4b-it | Gemma | | |
| microsoft/Phi-4-multimodal-instruct | Phi | | |
| allenai/OLMoE-1B-7B-0924 | OLMoE | | |
| stabilityai/stablelm-2-1_6b | StableLM | | |
| CohereForAI/c4ai-command-r-v01 | Command-R | | |
| huihui-ai/grok-2 | Grok | | |
| ZhipuAI/chatglm2-6b | ChatGLM | | |
| Shanghai_AI_Laboratory/internlm2-7b | InternLM 2 | | |
| LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct | ExaONE 3 | | |
| xverse/XVERSE-MoE-A36B | XVERSE | | |
| HuggingFaceTB/SmolLM-1.7B | SmolLM | | |
| ZhipuAI/glm-4-9b-chat | GLM-4 | | |
| XiaomiMiMo/MiMo-7B-RL | MiMo | | |
| arcee-ai/AFM-4.5B-Base | Arcee AFM-4.5B | | |
| Howeee/persimmon-8b-chat | Persimmon | | |
| inclusionAI/Ling-lite | Ling | | |
| ibm-granite/granite-3.1-8b-instruct | Granite | | |
| ibm-granite/granite-3.0-3b-a800m-instruct | Granite MoE | | |
| AI-ModelScope/dbrx-instruct | DBRX (Databricks) | | |
| baichuan-inc/Baichuan2-13B-Chat | Baichuan 2 (7B, 13B) | | |
| baidu/ERNIE-4.5-21B-A3B-PT | ERNIE-4.5 (4.5, 4.5MoE series) | | |
| openbmb/MiniCPM3-4B | MiniCPM (v3, 4B) | | |
| Kimi/Kimi-K2-Thinking | Kimi | | |
| openai/gpt-oss-120b | GPTOSS | | |

Multimodal Language Models

| Models | Model Family (Variants) | A2 Supported | A3 Supported |
| --- | --- | --- | --- |
| Qwen/Qwen2.5-VL-3B-Instruct | Qwen-VL | | |
| Qwen/Qwen2.5-VL-72B-Instruct | Qwen-VL | | |
| Qwen/Qwen3-VL-30B-A3B-Instruct | Qwen-VL | | |
| Qwen/Qwen3-VL-8B-Instruct | Qwen-VL | | |
| Qwen/Qwen3-VL-4B-Instruct | Qwen-VL | | |
| Qwen/Qwen3-VL-235B-A22B-Instruct | Qwen-VL | | |
| deepseek-ai/deepseek-vl2 | DeepSeek-VL2 | | |
| deepseek-ai/Janus-Pro-1B | Janus-Pro (1B, 7B) | | |
| deepseek-ai/Janus-Pro-7B | Janus-Pro (1B, 7B) | | |
| openbmb/MiniCPM-V-2_6 | MiniCPM-V / MiniCPM-o | | |
| openbmb/MiniCPM-o-2_6 | MiniCPM-V / MiniCPM-o | | |
| google/gemma-3-4b-it | Gemma 3 (Multimodal) | | |
| mistralai/Mistral-Small-3.1-24B-Instruct-2503 | Mistral-Small-3.1-24B | | |
| microsoft/Phi-4-multimodal-instruct | Phi-4-multimodal-instruct | | |
| XiaomiMiMo/MiMo-VL-7B-RL | MiMo-VL (7B) | | |
| AI-ModelScope/llava-v1.6-34b | LLaVA (v1.5 & v1.6) | | |
| lmms-lab/llava-next-72b | LLaVA-NeXT (8B, 72B) | | |
| lmms-lab/llava-onevision-qwen2-7b-ov | LLaVA-OneVision | | |
| Kimi/Kimi-VL-A3B-Instruct | Kimi-VL (A3B) | | |
| ZhipuAI/GLM-4.5V | GLM-4.5V (106B) | | |
| meta-llama/Llama-3.2-11B-Vision-Instruct | Llama 3.2 Vision (11B) | | |

Embedding Models

| Models | Model Family | A2 Supported | A3 Supported |
| --- | --- | --- | --- |
| intfloat/e5-mistral-7b-instruct | E5 (Llama/Mistral based) | | |
| iic/gte_Qwen2-1.5B-instruct | GTE-Qwen2 | | |
| Qwen/Qwen3-Embedding-8B | Qwen3-Embedding | | |
| Alibaba-NLP/gme-Qwen2-VL-2B-Instruct | GME (Multimodal) | | |
| AI-ModelScope/clip-vit-large-patch14-336 | CLIP | | |
| BAAI/bge-large-en-v1.5 | BGE | | |

Reward Models

| Models | Model Family | A2 Supported | A3 Supported |
| --- | --- | --- | --- |
| Skywork/Skywork-Reward-Llama-3.1-8B-v0.2 | Llama3.1 Reward | | |
| Shanghai_AI_Laboratory/internlm2-7b-reward | InternLM 2 Reward | | |
| Qwen/Qwen2.5-Math-RM-72B | Qwen2.5 Reward - Math | | |
| Howeee/Qwen2.5-1.5B-apeach | Qwen2.5 Reward - Sequence | | |
| Skywork/Skywork-Reward-Gemma-2-27B-v0.2 | Gemma 2-27B Reward | | |

Rerank Models

| Models | Model Family | A2 Supported | A3 Supported |
| --- | --- | --- | --- |
| BAAI/bge-reranker-v2-m3 | BGE-Reranker | | |