Custom Chat Template

NOTE: The SGLang project has two chat template systems. This document covers setting a custom chat template for the OpenAI-compatible API server (defined in conversation.py). It is NOT related to the chat template used by the SGLang language frontend (defined in chat_template.py).

By default, the server uses the chat template specified in the model's tokenizer from Hugging Face. This should work out of the box for most official models such as Llama-2 and Llama-3. If needed, you can override the chat template when launching the server:

python -m sglang.launch_server \
  --model-path meta-llama/Llama-2-7b-chat-hf \
  --port 30000 \
  --chat-template llama-2

If the chat template you are looking for is missing, you are welcome to contribute it or load it from a file.
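
Because the template is applied on the server side, a client only sends role-tagged messages and never formats the prompt itself. The sketch below builds (but does not send) such a request using only the Python standard library. It assumes the server launched above is reachable at localhost:30000 and exposes the OpenAI-style /v1/chat/completions route; the model name "default" and the helper build_chat_request are placeholders for illustration.

```python
import json
import urllib.request

# Assumption: the server from the launch command above runs locally on port 30000.
BASE_URL = "http://localhost:30000"


def build_chat_request(messages, model="default", base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style chat completion request.

    Hypothetical helper: the "default" model name is a placeholder.
    """
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a chat template?"},
]
request = build_chat_request(messages)

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```

The request body contains no special tokens or separators. Whichever chat template the server was launched with decides how these messages become the final prompt.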

JSON Format

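You can load a chat template from a JSON file in the format defined by conversation.py. For example, save the following as my_model_template.json and pass its path to --chat-template: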

{
  "name": "my_model",
  "system": "<|im_start|>system",
  "user": "<|im_start|>user",
  "assistant": "<|im_start|>assistant",
  "sep_style": "CHATML",
  "sep": "<|im_end|>",
  "stop_str": ["<|im_end|>", "<|im_start|>"]
}

python -m sglang.launch_server \
  --model-path meta-llama/Llama-2-7b-chat-hf \
  --port 30000 \
  --chat-template ./my_model_template.json
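
To make the JSON fields more concrete, here is a minimal sketch that writes the same template to a file and previews roughly what a ChatML-style prompt could look like. The preview_prompt helper is hypothetical and only approximates the layout (role tag, newline, content, separator, newline); it is not the rendering code in conversation.py, so treat its output as an illustration rather than the exact server prompt.

```python
import json
from pathlib import Path

# Same fields as the JSON example above.
TEMPLATE = {
    "name": "my_model",
    "system": "<|im_start|>system",
    "user": "<|im_start|>user",
    "assistant": "<|im_start|>assistant",
    "sep_style": "CHATML",
    "sep": "<|im_end|>",
    "stop_str": ["<|im_end|>", "<|im_start|>"],
}


def write_template(path, template=TEMPLATE):
    """Write the template as JSON so its path can be passed to --chat-template."""
    Path(path).write_text(json.dumps(template, indent=2), encoding="utf-8")


def preview_prompt(template, messages):
    """Approximate a ChatML-style prompt (hypothetical helper, not SGLang code).

    Assumes each turn is rendered as: role tag, newline, content, separator,
    newline, and that the prompt ends with an open assistant turn.
    """
    parts = []
    for message in messages:
        role_tag = template[message["role"]]
        parts.append(f"{role_tag}\n{message['content']}{template['sep']}\n")
    parts.append(f"{template['assistant']}\n")
    return "".join(parts)


example = preview_prompt(
    TEMPLATE,
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
```

Calling write_template("./my_model_template.json") produces the file used in the launch command above. The stop_str entries are not part of the prompt text; in this sketch they are simply stored in the file.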

Jinja Format

You can also use the Jinja template format defined by Hugging Face Transformers. Pass the path of the template file to --chat-template:

python -m sglang.launch_server \
  --model-path meta-llama/Llama-2-7b-chat-hf \
  --port 30000 \
  --chat-template ./my_model_template.jinja
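
As an illustration of what such a file might contain, the sketch below writes a ChatML-style Jinja template to disk. The template body is an example chosen for this sketch, not one shipped with SGLang; it relies on the messages and add_generation_prompt variables that Hugging Face Transformers chat templates receive. If the jinja2 package happens to be installed, the sketch also renders a short conversation as a local preview; otherwise that step is skipped.

```python
from pathlib import Path

try:
    from jinja2 import Template  # optional, used only for the local preview
except ImportError:
    Template = None

# Example ChatML-style template (an illustration, not an SGLang built-in).
# The raw string keeps "\n" as a two-character escape for Jinja to interpret.
CHATML_JINJA = (
    r"{% for message in messages %}"
    r"{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}"
    r"{% endfor %}"
    r"{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
)


def write_jinja_template(path, source=CHATML_JINJA):
    """Write the template so its path can be passed to --chat-template."""
    Path(path).write_text(source, encoding="utf-8")


# Local preview of the rendered prompt; None when jinja2 is not available.
rendered = None
if Template is not None:
    rendered = Template(CHATML_JINJA).render(
        messages=[{"role": "user", "content": "Hello!"}],
        add_generation_prompt=True,
    )
```

Calling write_jinja_template("./my_model_template.jinja") creates the file referenced in the launch command above.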
