@vllm_project
vLLM now offers an optimized inference recipe for DeepSeek-V3.2.

Startup details
Run vLLM with the DeepSeek-specific components:
--tokenizer-mode deepseek_v32 \
--tool-call-parser deepseek_v32

Usage tips
- Enable thinking mode in vLLM:
  extra_body={"chat_template_kwargs": {"thinking": True}}
- Read the model's reasoning from the `reasoning` field, not `reasoning_content`.

Special thanks to @TencentCloud for compute and engineering support.

Full recipe (including how to properly use the thinking with tool calls feature):
https://t.co/NgSiQz7sQZ

#vLLM #DeepSeek #Inference #ToolCalling #OpenSource
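The thinking-mode tip above can be sketched as a request body. This is a minimal, hedged example: the model name and prompt are placeholder assumptions, and with the official openai Python client the extra field would be passed via extra_body rather than built by hand.

```python
# Sketch of a thinking-mode request body for vLLM's OpenAI-compatible
# server. Model name below is an assumption, not from the post.
def thinking_request(prompt, model="deepseek-ai/DeepSeek-V3.2"):
    # extra_body={"chat_template_kwargs": {"thinking": True}} merges
    # these kwargs into the top level of the JSON request body.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "chat_template_kwargs": {"thinking": True},
    }

body = thinking_request("Explain tool calling in one sentence.")
print(body["chat_template_kwargs"])  # {'thinking': True}
```

With thinking enabled, the model's chain of thought should then be read back from the response's `reasoning` field.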