What You'll Find Here
This cookbook aggregates battle-tested SGLang recipes covering:
- Models: Mainstream LLMs and Vision-Language Models (VLMs)
- Use Cases: Inference serving, deployment strategies, multimodal applications
- Hardware: GPU and CPU configurations, optimization for different accelerators
- Best Practices: Configuration templates, performance tuning, troubleshooting guides
Guides
Autoregressive Models
Qwen
DeepSeek
Llama
GLM
OpenAI
Moonshotai
MiniMax
NVIDIA
Ernie
InternVL
InternLM
Jina AI
Mistral
Xiaomi
FlashLabs
- Chroma 1.0 (new)
Diffusion Models
FLUX
Qwen-Image
Wan
Z-Image
Benchmarks
Reference
- Server arguments: understanding all the arguments
Quick Start
- Browse the recipe index above to find your model
- Follow the step-by-step instructions in each guide
- Adapt configurations to your specific hardware and requirements
- Join our community to share feedback and improvements
Contributing
We believe the best documentation comes from practitioners. Whether you've optimized SGLang for a specific model, solved a tricky deployment challenge, or discovered performance improvements, we encourage you to contribute your recipes!
Ways to contribute:
- Add a new recipe for a model not yet covered
- Improve existing recipes with additional tips or configurations
- Report issues or suggest enhancements
- Share your production deployment experiences
Local Development
Prerequisites
- Node.js >= 18.0
- Mintlify CLI
Setup and Run
Install the Mintlify CLI and start the development server. The site is then available at http://localhost:3000.
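A minimal sketch of these two steps, assuming the CLI is published on npm as `mint` (older setups use the `mintlify` package and command instead) and that the commands are run from the directory containing the docs configuration. Treat the package name as an assumption and check the repository's own instructions if it differs.

```shell
# Install the Mintlify CLI globally (assumed package name: mint)
npm i -g mint

# Start the local development server from the docs root
mint dev
```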
Resources
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Let's build this resource together! Star the repo and contribute your recipes to help the SGLang community grow.
