Your current environment
The output of python collect_env.py
Your output of `python collect_env.py` here
🐛 Describe the bug
As the title says, I built the Docker image following the build steps in the tutorial. These are my launch parameters:
--model /home/models/Qwen3.5-27B-AWQ
--served_model_name Qwen3.5-27B
--quantization awq
--tensor-parallel-size 4
--dtype float16
--gpu-memory-utilization 0.90
--max-model-len 262144
--max-num-seqs 4
--max-num-batched-tokens 16384
--attention-backend FLASH_ATTN_V100
--enable-auto-tool-choice
--tool-call-parser qwen3_coder
--default-chat-template-kwargs '{"enable_thinking": false}'
--skip-mm-profiling
--enable-prefix-caching
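For anyone trying to reproduce this, the flags above can be assembled into a single shell-safe command line. The following is only a sketch: the flag names and values are copied from the list above, but the `ENTRYPOINT` (`python -m vllm.entrypoints.openai.api_server`) is an assumption, since the report does not say how the container starts the server, and `build_command` is a hypothetical helper name. It mainly shows that the JSON value for `--default-chat-template-kwargs` has to reach the server as one quoted argument.

```python
import json
import shlex

# ASSUMPTION: the report does not state the entrypoint used inside the image.
ENTRYPOINT = ["python", "-m", "vllm.entrypoints.openai.api_server"]

# Flags and values copied verbatim from the report.
FLAGS = [
    "--model", "/home/models/Qwen3.5-27B-AWQ",
    "--served_model_name", "Qwen3.5-27B",
    "--quantization", "awq",
    "--tensor-parallel-size", "4",
    "--dtype", "float16",
    "--gpu-memory-utilization", "0.90",
    "--max-model-len", "262144",
    "--max-num-seqs", "4",
    "--max-num-batched-tokens", "16384",
    "--attention-backend", "FLASH_ATTN_V100",
    "--enable-auto-tool-choice",
    "--tool-call-parser", "qwen3_coder",
    # Serialize the kwargs so the value is guaranteed to be valid JSON.
    "--default-chat-template-kwargs", json.dumps({"enable_thinking": False}),
    "--skip-mm-profiling",
    "--enable-prefix-caching",
]


def build_command(entrypoint=ENTRYPOINT, flags=FLAGS):
    """Return one shell-quoted command string (hypothetical helper)."""
    return shlex.join(list(entrypoint) + list(flags))


if __name__ == "__main__":
    print(build_command())
```

Splitting the printed string with `shlex.split` gives back the original argument list, which is a quick way to confirm the JSON argument survives shell quoting intact.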
Before submitting a new issue...