Supported Models & Parsers
| Model | Reasoning tags | Parser | Notes |
|---|---|---|---|
| DeepSeek‑R1 series | `<think>` … `</think>` | `deepseek-r1` | Supports all variants (R1, R1-0528, R1-Distill) |
| DeepSeek‑V3 series | `<think>` … `</think>` | `deepseek-v3` | Including DeepSeek‑V3.2. Supports `thinking` parameter |
| Standard Qwen3 models | `<think>` … `</think>` | `qwen3` | Supports `enable_thinking` parameter |
| Qwen3-Thinking models | `<think>` … `</think>` | `qwen3` or `qwen3-thinking` | Always generates thinking content |
| Kimi K2 Thinking | `◁think▷` … `◁/think▷` | `kimi_k2` | Uses special thinking delimiters. Also requires `--tool-call-parser kimi_k2` for tool use. |
| GPT OSS | `<\|channel\|>analysis<\|message\|>` … `<\|end\|>` | `gpt-oss` | N/A |
Model-Specific Behaviors
DeepSeek-R1 Family:
- DeepSeek-R1: No `<think>` start tag, jumps directly to thinking content
- DeepSeek-R1-0528: Generates both `<think>` start and `</think>` end tags
- Both are handled by the same `deepseek-r1` parser

- DeepSeek-V3.1/V3.2: Hybrid model supporting both thinking and non-thinking modes; use the `deepseek-v3` parser and `thinking` parameter (NOTE: not `enable_thinking`)
- Standard Qwen3 (e.g., Qwen3-2507): Use the `qwen3` parser; supports `enable_thinking` in chat templates
- Qwen3-Thinking (e.g., Qwen3-235B-A22B-Thinking-2507): Use the `qwen3` or `qwen3-thinking` parser; always thinks
- Kimi K2 Thinking: Uses special `◁think▷` and `◁/think▷` tags. For agentic tool use, also specify `--tool-call-parser kimi_k2`.
- GPT OSS: Uses special `<|channel|>analysis<|message|>` and `<|end|>` tags
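
The tag behaviors listed above can be illustrated with a simplified splitter. This is only a sketch of the general idea, not SGLang's actual parser: `split_reasoning` and its parameters are hypothetical names, and the function assumes an always-thinking model, so output without a closing tag is treated as unfinished reasoning.

```python
from typing import Tuple


def split_reasoning(
    text: str,
    start_tag: str = "<think>",
    end_tag: str = "</think>",
) -> Tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Hypothetical helper for illustration; not part of SGLang.
    """
    stripped = text.lstrip()

    # R1-0528 style output begins with the start tag; plain R1 omits it.
    if stripped.startswith(start_tag):
        stripped = stripped[len(start_tag):]

    # Assumption: with no end tag, the model is still reasoning.
    if end_tag not in stripped:
        return stripped.strip(), ""

    reasoning, _, answer = stripped.partition(end_tag)
    return reasoning.strip(), answer.strip()


# DeepSeek-R1 style: no opening tag.
print(split_reasoning("2 + 2 is 4.</think>The answer is 4."))
# DeepSeek-R1-0528 style: both tags present.
print(split_reasoning("<think>2 + 2 is 4.</think>The answer is 4."))
# Kimi K2 Thinking style delimiters.
print(split_reasoning("◁think▷check◁/think▷done", "◁think▷", "◁/think▷"))
```

Because both DeepSeek-R1 variants reduce to the same (reasoning, answer) pair here, a single parser can plausibly serve both, which matches the note that they share `deepseek-r1`.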
Usage
Launching the Server
Specify the `--reasoning-parser` option when launching the server.
`--reasoning-parser` defines the parser used to interpret responses.
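
As a rough sketch, the launch command could be assembled in Python as shown here. The helper `build_launch_command` is a hypothetical name, and the model path and port are example values chosen for illustration; only the `--reasoning-parser` flag comes from the text above.

```python
import shlex
import sys
from typing import List


def build_launch_command(
    model_path: str,
    reasoning_parser: str,
    port: int = 30000,
) -> List[str]:
    """Return an argv list that launches the SGLang server.

    Hypothetical helper; it only builds the command and does not run it.
    """
    return [
        sys.executable, "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--reasoning-parser", reasoning_parser,
        "--port", str(port),
    ]


# Example values: the model path is an assumption, not mandated by the docs.
cmd = build_launch_command("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B", "deepseek-r1")
print(shlex.join(cmd))

# To actually start the server (requires SGLang and suitable hardware):
# import subprocess
# server = subprocess.Popen(cmd)
```

The parser name passed here should match the "Parser" column of the table for the model being served.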
OpenAI Compatible API
Using the OpenAI compatible API, the contract follows the DeepSeek API design established with the release of DeepSeek-R1:
- `reasoning_content`: The content of the CoT.
- `content`: The content of the final answer.
Non-Streaming Request
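
A minimal sketch of a non-streaming request, using only the standard library. The helper names (`build_chat_payload`, `extract_message`, `post_chat`) are hypothetical, the base URL and model name are example values, and placing `separate_reasoning` at the top level of the JSON body is an assumption about how the server reads that option.

```python
import json
import urllib.request
from typing import Any, Dict, Optional, Tuple


def build_chat_payload(
    model: str, prompt: str, separate_reasoning: bool = True
) -> Dict[str, Any]:
    """Build a chat completion request body (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        # Assumption: the server accepts this as a top-level field.
        "separate_reasoning": separate_reasoning,
    }


def extract_message(response: Dict[str, Any]) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning_content, content) from a chat completion response."""
    message = response["choices"][0]["message"]
    return message.get("reasoning_content"), message.get("content")


def post_chat(base_url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """POST the payload to a running server; not called in this demo."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Offline demonstration with a hand-written response of the expected shape.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "reasoning_content": "1 + 3 equals 4.",
                "content": "The answer is 4.",
            }
        }
    ]
}
reasoning, answer = extract_message(sample_response)
print("reasoning:", reasoning)
print("answer:", answer)

# With a server running, for example on port 30000:
# payload = build_chat_payload("deepseek-ai/DeepSeek-R1-Distill-Qwen-7B", "What is 1+3?")
# print(extract_message(post_chat("http://localhost:30000", payload)))
```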
Streaming Request
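
When streaming, the two fields are expected to arrive as incremental deltas. The sketch below shows one way to accumulate them; `accumulate_stream` is a hypothetical helper, and the chunk layout (`choices[0].delta` carrying `reasoning_content` and `content`) is assumed to mirror the OpenAI chunk format with the extra reasoning field.

```python
from typing import Any, Dict, Iterable, List, Tuple


def accumulate_stream(chunks: Iterable[Dict[str, Any]]) -> Tuple[str, str]:
    """Concatenate streamed deltas into (reasoning_content, content)."""
    reasoning_parts: List[str] = []
    content_parts: List[str] = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            # Some chunks (for example usage summaries) carry no choices.
            continue
        delta = choices[0].get("delta") or {}
        if delta.get("reasoning_content"):
            reasoning_parts.append(delta["reasoning_content"])
        if delta.get("content"):
            content_parts.append(delta["content"])
    return "".join(reasoning_parts), "".join(content_parts)


# Hand-written chunks standing in for a real server stream.
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"reasoning_content": "1 + 3 "}}]},
    {"choices": [{"delta": {"reasoning_content": "equals 4."}}]},
    {"choices": [{"delta": {"content": "The answer "}}]},
    {"choices": [{"delta": {"content": "is 4."}}]},
    {"choices": []},
]
reasoning, answer = accumulate_stream(fake_chunks)
print("reasoning:", reasoning)
print("answer:", answer)
```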
To disable reasoning separation, set the `separate_reasoning` option to `False` in the request.
SGLang Native API
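
With the native API, generation and reasoning separation can be done as two calls. The sketch below assumes a `/generate` endpoint that returns a `text` field and a `/separate_reasoning` endpoint that accepts `text` and `reasoning_parser`; treat these endpoint and field names, as well as the helper names and sampling values, as assumptions to verify against your SGLang version.

```python
import json
import urllib.request
from typing import Any, Dict


def build_generate_payload(prompt: str, max_new_tokens: int = 1024) -> Dict[str, Any]:
    """Request body for the generation call (field names assumed)."""
    return {
        "text": prompt,
        "sampling_params": {"max_new_tokens": max_new_tokens, "temperature": 0.6},
    }


def build_separate_reasoning_payload(
    generated_text: str, reasoning_parser: str
) -> Dict[str, Any]:
    """Request body for the separation call (field names assumed)."""
    return {"text": generated_text, "reasoning_parser": reasoning_parser}


def post_json(url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """POST JSON to a running server and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def generate_and_separate(
    base_url: str, prompt: str, reasoning_parser: str
) -> Dict[str, Any]:
    """Generate text, then ask the server to split out the reasoning."""
    generated = post_json(f"{base_url}/generate", build_generate_payload(prompt))
    return post_json(
        f"{base_url}/separate_reasoning",
        build_separate_reasoning_payload(generated["text"], reasoning_parser),
    )


# Payloads can be inspected without a server.
print(build_generate_payload("What is 1+3?"))
print(build_separate_reasoning_payload("1+3=4</think>4", "deepseek-r1"))

# With a server running, for example on port 30000:
# print(generate_and_separate("http://localhost:30000", "What is 1+3?", "deepseek-r1"))
```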
Offline Engine API
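
For offline use, the parser can be applied directly to generated text. The sketch below keeps the SGLang import lazy so the rest of the code runs without it; the `ReasoningParser` class, its `model_type` and `stream_reasoning` arguments, and its `parse_non_stream` method are assumptions based on the module path named in this document, and may have moved or changed in your version. `separate_outputs` and `load_sglang_parse_fn` are hypothetical helper names.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

ParseFn = Callable[[str], Tuple[str, str]]


def load_sglang_parse_fn(parser_name: str) -> ParseFn:
    """Return a function mapping raw text to (reasoning_text, text).

    Assumed API: imported lazily so this module loads without SGLang.
    """
    from sglang.srt.reasoning_parser import ReasoningParser

    parser = ReasoningParser(model_type=parser_name, stream_reasoning=False)
    return parser.parse_non_stream


def separate_outputs(
    outputs: Iterable[Dict[str, Any]], parse_fn: ParseFn
) -> List[Dict[str, str]]:
    """Apply a parse function to each engine output's "text" field."""
    results = []
    for output in outputs:
        reasoning_text, text = parse_fn(output["text"])
        results.append({"reasoning_text": reasoning_text, "text": text})
    return results


def fake_parse(text: str) -> Tuple[str, str]:
    """Stand-in parser used only for this offline demonstration."""
    reasoning, _, answer = text.partition("</think>")
    return reasoning.strip(), answer.strip()


print(separate_outputs([{"text": "1+3=4</think>The answer is 4."}], fake_parse))

# Typical use with SGLang installed (not executed here):
# import sglang as sgl
# llm = sgl.Engine(model_path="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
# outputs = llm.generate(prompt=["What is 1+3?"], sampling_params={"max_new_tokens": 1024})
# print(separate_outputs(outputs, load_sglang_parse_fn("deepseek-r1")))
```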
Supporting New Reasoning Model Schemas
To support a future reasoning model, implement its reasoning parser as a subclass of `BaseReasoningFormatDetector` in `python/sglang/srt/reasoning_parser.py`, then specify that parser for the new reasoning model schema.
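
The shape of such a subclass might look like the sketch below. To stay runnable without SGLang, it uses `StandInDetector`, a simplified stand-in written for this example; the real `BaseReasoningFormatDetector` may have a different constructor and return type. The `<reason>` tags, the `MyModelDetector` class, the `my-model` parser name, and the `PARSERS` registry are all hypothetical.

```python
from typing import Dict, Tuple, Type


class StandInDetector:
    """Simplified stand-in for the real base class (interface assumed)."""

    def __init__(
        self,
        think_start_token: str,
        think_end_token: str,
        force_reasoning: bool = False,
    ) -> None:
        self.think_start_token = think_start_token
        self.think_end_token = think_end_token
        # True for models that always think and may omit the start token.
        self.force_reasoning = force_reasoning

    def detect_and_parse(self, text: str) -> Tuple[str, str]:
        """Return (reasoning_text, normal_text) for a complete output."""
        has_start = text.startswith(self.think_start_token)
        if not (has_start or self.force_reasoning):
            return "", text
        if has_start:
            text = text[len(self.think_start_token):]
        reasoning, _, normal = text.partition(self.think_end_token)
        return reasoning.strip(), normal.strip()


class MyModelDetector(StandInDetector):
    """Detector for a hypothetical model that uses <reason> tags."""

    def __init__(self) -> None:
        super().__init__("<reason>", "</reason>", force_reasoning=False)


# Hypothetical registry mapping a parser name to its detector class.
PARSERS: Dict[str, Type[StandInDetector]] = {"my-model": MyModelDetector}

detector = PARSERS["my-model"]()
print(detector.detect_and_parse("<reason>Check units.</reason>42 km"))
print(detector.detect_and_parse("No reasoning here."))
```

Keeping the tags in the subclass constructor means a new schema mostly amounts to choosing its delimiters and whether reasoning is forced, provided the base class handles the shared parsing logic.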