[Bug]: 在4卡 V100上运行qwen3.5-27B-AWQ，输出全部为!

### Your current environment

<details>
<summary>The output of <code>python collect_env.py</code></summary>

```text
Your output of `python collect_env.py` here
```

</details>


### 🐛 Describe the bug

如题，使用教程中的构建步骤构建docker镜像，以下是我的启动参数：
--model /home/models/Qwen3.5-27B-AWQ
      --served_model_name Qwen3.5-27B
      --quantization awq
      --tensor-parallel-size 4
      --dtype float16
      --gpu-memory-utilization 0.90
      --max-model-len 262144
      --max-num-seqs 4
      --max-num-batched-tokens 16384
      --attention-backend FLASH_ATTN_V100
      --enable-auto-tool-choice 
      --tool-call-parser qwen3_coder
      --default-chat-template-kwargs '{"enable_thinking": false}'
      --skip-mm-profiling
      --enable-prefix-caching

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: 在4卡 V100上运行qwen3.5-27B-AWQ，输出全部为! #4

Your current environment

🐛 Describe the bug

Before submitting a new issue...

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[Bug]: 在4卡 V100上运行qwen3.5-27B-AWQ，输出全部为! #4

Description

Your current environment

🐛 Describe the bug

Before submitting a new issue...

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions