Issue:
In server.py, the tool-call streaming loop incorrectly terminates a tool call when gen.text is an empty string and the model's tool_call_end marker is also "".
2026-03-27 20:47:58,659 - INFO - KV Caches: 1 seq, 7.06 GB, latest user cache 0 tokens
127.0.0.1 - - [27/Mar/2026 20:47:58] "POST /v1/chat/completions HTTP/1.1" 200 -
2026-03-27 20:47:58,680 - DEBUG - Starting stream:
2026-03-27 20:48:00,370 - INFO - Prompt processing progress: 95/96
2026-03-27 20:48:00,905 - DEBUG - [TOOL_CALLS]
2026-03-27 20:48:01,117 - DEBUG - write
2026-03-27 20:48:01,339 - DEBUG - [ARGS]
2026-03-27 20:48:01,571 - DEBUG - {"
2026-03-27 20:48:01,810 - DEBUG - file
2026-03-27 20:48:02,052 - DEBUG - Path
2026-03-27 20:48:02,299 - DEBUG - ":
2026-03-27 20:48:02,545 - DEBUG - "/
...
2026-03-27 20:48:09,790 - DEBUG - content
2026-03-27 20:48:10,040 - DEBUG - ":
2026-03-27 20:48:10,290 - DEBUG - "#
2026-03-27 20:48:10,538 - DEBUG - Document
2026-03-27 20:48:10,787 - DEBUG - Loader
2026-03-27 20:48:11,035 - DEBUG - Module
2026-03-27 20:48:11,283 - DEBUG - \n
2026-03-27 20:48:11,533 - DEBUG - #
...
2026-03-27 20:48:25,284 - DEBUG - import
2026-03-27 20:48:25,547 - DEBUG - torch
2026-03-27 20:48:25,808 - DEBUG - \n
2026-03-27 20:48:26,075 - DEBUG - \n
2026-03-27 20:48:26,345 - DEBUG - #
2026-03-27 20:48:26,610 - DEBUG - ---
2026-03-27 20:48:26,871 - DEBUG -
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 56974)
Traceback (most recent call last):
...
packages/mlx_lm/tool_parsers/mistral.py", line 17, in parse_tool_call
raise ValueError(f"Could not parse tool call from: {text}")
ValueError: Could not parse tool call from: write[ARGS]{"filePath": "/Volumes/Code/transformation-playground/maia-retrieval-layer/statements_retriever/doc_loader.py", "content": "# Document Loader Module\n# This module handles document loading and caching functionality\n\nfrom collections import defaultdict\nfrom dataclasses import dataclass, field\nimport hashlib\nfrom pathlib import Path\nfrom typing import Any, Callable, Dict, List, Protocol, Tuple\nimport torch\n\n# ---
$ pip show mlx-lm | grep "Version"
Version: 0.31.1

Possible cause:
Streaming detokenizers yield gen.text == "" while buffering tokens for punctuation cleaning or while completing multi-byte UTF-8 sequences. The server's equality check, if gen.text == ctx.tool_call_end:, cannot distinguish such an intermediate buffered token from an actual end-of-call marker when the end marker is an empty string.
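A minimal sketch of the failure mode (not the actual server.py loop; the chunk sequence and variable names are illustrative paraphrases of gen.text / ctx.tool_call_end):

```python
def stream_tool_call(chunks, tool_call_end):
    """Buggy version: ends the tool call on the first chunk equal to
    tool_call_end -- which also matches intermediate "" chunks when the
    model's end marker is the empty string."""
    tool_calls, tool_text, in_tool_call = [], "", True
    for text in chunks:
        if in_tool_call:
            if text == tool_call_end:  # "" == "" fires on a buffered chunk
                tool_calls.append(tool_text)
                tool_text = ""
                in_tool_call = False
            else:
                tool_text += text
    return tool_calls, in_tool_call

# A detokenizer may emit "" while buffering a multi-byte UTF-8 sequence;
# the call is cut off before the JSON arguments ever arrive.
chunks = ["write", "[ARGS]", "", '{"filePath": "...", "content": "..."}']
calls, still_open = stream_tool_call(chunks, tool_call_end="")
# calls == ["write[ARGS]"] -- exactly the truncated text the mistral
# parser then fails on with "Could not parse tool call from: ..."
```

The truncated "write[ARGS]" string matches the shape of the text in the ValueError above, minus the arguments that were still streaming.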
Models I tried:
$ ls -l ~/.cache/huggingface/hub | grep Devstral
drwxr-xr-x@ 5 vitaly staff 160 Mar 26 13:22 models--mlx-community--Devstral-2-123B-Instruct-2512-4bit
drwxr-xr-x@ 5 vitaly staff 160 Mar 28 01:16 models--mlx-community--mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-BF16
drwxr-xr-x@ 5 vitaly staff 160 Mar 28 00:31 models--mlx-community--mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-MXFP4

The models were called from mistral-vibe:
active_model = "devstral-local"
[[providers]]
name = "local_mlx"
api_style = "openai"
api_base = "http://127.0.0.1:8080/v1"
api_key = "dummy_key"
# backend = "generic"
[[models]]
alias = "devstral-local"
provider = "local_mlx"
# name = "mlx-community/Devstral-2-123B-Instruct-2512-4bit"
# name = "mlx-community/mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-MXFP4"
name = "mlx-community/mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-BF16"

Proposed Fix:
Ensure tool_call_end is not "" before performing the equality check in server.py:
elif in_tool_call:
    # if gen.text == ctx.tool_call_end:  # <-- leads to parsing an incomplete tool call
    if ctx.tool_call_end and gen.text == ctx.tool_call_end:
        tool_calls.append(tool_text)
        tool_text = ""
        in_tool_call = False
    else:
        tool_text += gen.text

This lets intermediate empty strings correctly fall through to the else branch, where they are appended to the tool_text buffer without terminating the call.