NOTE: There are two chat template systems in the SGLang project. This document covers setting a custom chat template for the OpenAI-compatible API server (defined in conversation.py). It is NOT related to the chat template used by the SGLang language frontend (defined in chat_template.py).

By default, the server uses the chat template specified in the model's Hugging Face tokenizer. This should work without changes for most official models, such as Llama-2 and Llama-3.

If needed, you can also override the chat template when launching the server: