Stop generation on consecutive duplicate tool calls by chopchop-jiahao · Pull Request #1027 · ml-explore/mlx-lm

chopchop-jiahao · 2026-03-20T08:28:31Z

Summary

When using tool calling with certain models (e.g. Qwen3.5-9B), the model sometimes generates 43–64 identical tool calls in a single response until max_tokens is exhausted. This wastes compute and confuses the agent loop.

This PR adds a lightweight ToolCallDedup class that compares each completed tool call with the previous one. On a consecutive duplicate, generation stops early with finish_reason=tool_calls.
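For readers who want a concrete picture of the comparison described above, here is a minimal sketch. It is not the code from this PR: the method name `is_duplicate`, the `_previous` attribute, and the example tool call are assumptions made for illustration.

```python
from typing import Optional


class ToolCallDedup:
    """Illustrative consecutive-duplicate detector (hypothetical, not the PR's code)."""

    def __init__(self) -> None:
        # Raw text of the most recently completed tool call, if any.
        self._previous: Optional[str] = None

    def is_duplicate(self, tool_call_text: str) -> bool:
        """Return True if this call is textually identical to the previous one."""
        duplicate = self._previous is not None and tool_call_text == self._previous
        # Remember this call so the next one is compared against it.
        self._previous = tool_call_text
        return duplicate


# One detector per request; the tool call below is made up for the example.
dedup = ToolCallDedup()
first = dedup.is_duplicate('{"name": "ls", "arguments": {"path": "."}}')
second = dedup.is_duplicate('{"name": "ls", "arguments": {"path": "."}}')
```

In this sketch `first` is `False` because there is nothing to compare against yet, and `second` is `True`, which is the point at which a caller would stop generation.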

Changes:

New mlx_lm/tool_call_dedup.py — simple consecutive duplicate detector (exact string comparison, no JSON normalization)
Integration in server.py — one instance per request, checked at each tool_call_end token
Unit tests covering: first call, different calls, consecutive/non-consecutive duplicates, whitespace sensitivity, logging, state management
Integration tests through the full HTTP pipeline, including a 43-duplicate reproduction of the real-world scenario, both streaming and non-streaming
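The per-request check at each tool_call_end token could look roughly like the loop below. This is a sketch under stated assumptions, not the server.py change itself: the marker strings, the function name `collect_tool_calls`, the choice to drop the duplicate call, and the use of "length" when the stream runs out are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical marker strings; real models define their own tool-call delimiters.
TOOL_CALL_START = "<tool_call>"
TOOL_CALL_END = "</tool_call>"


def collect_tool_calls(tokens: Iterable[str]) -> Tuple[List[str], str]:
    """Consume a token stream, stopping at the first consecutive duplicate call.

    Returns the tool calls that were kept and a finish reason.
    """
    calls: List[str] = []
    previous: Optional[str] = None        # per-request state: last completed call
    current: Optional[List[str]] = None   # pieces of the call in progress, None outside a call

    for token in tokens:
        if token == TOOL_CALL_START:
            current = []
        elif token == TOOL_CALL_END and current is not None:
            text = "".join(current)
            current = None
            if text == previous:
                # Consecutive duplicate: drop it and stop early.
                return calls, "tool_calls"
            calls.append(text)
            previous = text
        elif current is not None:
            current.append(token)

    # Stream exhausted without a duplicate; "length" stands in for hitting max_tokens.
    return calls, "length"


# Simulated reproduction: the same call emitted 43 times in one response.
one_call = [TOOL_CALL_START, '{"name": "get_weather", ', '"arguments": {"city": "Paris"}}', TOOL_CALL_END]
calls, finish_reason = collect_tool_calls(one_call * 43)
```

With the simulated stream, the loop returns after the second call completes, so only one call is kept and the remaining 41 repetitions are never consumed.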

Design decisions:

Only detects consecutive duplicates (A-A), not alternating patterns (A-B-A-B) — sufficient for observed behavior and avoids false positives
Exact text comparison — conservative, no risk of normalizing away meaningful differences
No configuration flag — the behavior is strictly beneficial (stops degenerate loops)
Per-request instance — no thread safety concerns
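The first two design decisions can be illustrated with a small pure function. This is a hedged sketch, not code from the PR: `first_consecutive_duplicate` and the sample tool calls are hypothetical names chosen for the example.

```python
from typing import List, Optional


def first_consecutive_duplicate(calls: List[str]) -> Optional[int]:
    """Index of the first call that exactly equals its predecessor, else None."""
    for i in range(1, len(calls)):
        # Exact string comparison: no JSON parsing, no whitespace normalization.
        if calls[i] == calls[i - 1]:
            return i
    return None


A = '{"name": "read", "arguments": {"path": "a.txt"}}'
B = '{"name": "read", "arguments": {"path": "b.txt"}}'
A_SPACED = '{"name": "read",  "arguments": {"path": "a.txt"}}'  # one extra space

repeated = first_consecutive_duplicate([A, A])            # A-A pattern
alternating = first_consecutive_duplicate([A, B, A, B])   # A-B-A-B pattern
whitespace = first_consecutive_duplicate([A, A_SPACED])   # same JSON, different text
```

Here `repeated` is `1`, while `alternating` and `whitespace` are both `None`: an alternating loop is not flagged, and two calls that differ only in whitespace count as different. That is the conservative trade-off the list above describes.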

Closes #613

Models like Qwen3.5-9B generate 43-64 identical tool calls in a single response until max_tokens, wasting compute and confusing agent loops. Add ToolCallDedup: compares each completed tool call with the previous one during generation. On consecutive duplicate, stop with finish_reason=tool_calls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chopchop-jiahao and others added 2 commits March 20, 2026 08:23

fix: remove trailing comma in test method signature

1c4262f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chopchop-jiahao marked this pull request as draft March 20, 2026 08:37

chopchop-jiahao wants to merge 2 commits into ml-explore:main from chopchop-jiahao:feat/tool-call-dedup