
Fix per-request enable_thinking toggle in server#1030

Open
eyupcanakman wants to merge 1 commit into ml-explore:main from eyupcanakman:fix/per-request-thinking-toggle-914

Conversation

@eyupcanakman
Contributor

Fixes #914.

PR #829 added chat_template_kwargs support so clients can send enable_thinking per request, but has_thinking was still read from the static tokenizer property. The per-request value was applied to the chat template yet ignored during response handling.

Added a _has_thinking() helper that checks per-request kwargs first, then CLI --chat-template-args, and finally falls back to tokenizer.has_thinking. Both the batch and single generation paths use it.

The chat_template_kwargs from client requests (e.g. enable_thinking)
were applied to the chat template during tokenization but ignored when
building the GenerationContext. This meant the response handler always
used the static tokenizer.has_thinking flag, so reasoning detection
and prompt checkpointing could not be toggled per request.

Add _has_thinking() that resolves the effective thinking state from
per-request kwargs, CLI --chat-template-args, then the tokenizer
default. Use it in both generation paths and prompt checkpointing.

Fixes ml-explore#914
@Thump604

The priority chain (per-request kwargs > CLI args > tokenizer default) is the right design and fixes a real bug: currently, sending enable_thinking: false per request still triggers think-token stripping in the response.

Heads up: this conflicts with PR #1006, which also modifies _compute_prompt_checkpoint (same function, and both PRs add an args parameter to its signature). Whichever lands first, the other will need a rebase.



Development

Successfully merging this pull request may close these issues.

Adjusting thinking/reasoning on a per-message basis?
