Mistral Tool Calling: Premature termination on empty tokens

### Issue:

In server.py, the tool-calling logic incorrectly terminates if `gen.text` is an empty string and the model's `tool_call_end` is also "" .

```text
2026-03-27 20:47:58,659 - INFO - KV Caches: 1 seq, 7.06 GB, latest user cache 0 tokens
127.0.0.1 - - [27/Mar/2026 20:47:58] "POST /v1/chat/completions HTTP/1.1" 200 -
2026-03-27 20:47:58,680 - DEBUG - Starting stream:
2026-03-27 20:48:00,370 - INFO - Prompt processing progress: 95/96
2026-03-27 20:48:00,905 - DEBUG - [TOOL_CALLS]
2026-03-27 20:48:01,117 - DEBUG - write
2026-03-27 20:48:01,339 - DEBUG - [ARGS]
2026-03-27 20:48:01,571 - DEBUG - {"
2026-03-27 20:48:01,810 - DEBUG - file
2026-03-27 20:48:02,052 - DEBUG - Path
2026-03-27 20:48:02,299 - DEBUG - ":
2026-03-27 20:48:02,545 - DEBUG -  "/
...
2026-03-27 20:48:09,790 - DEBUG - content
2026-03-27 20:48:10,040 - DEBUG - ":
2026-03-27 20:48:10,290 - DEBUG -  "#
2026-03-27 20:48:10,538 - DEBUG -  Document
2026-03-27 20:48:10,787 - DEBUG -  Loader
2026-03-27 20:48:11,035 - DEBUG -  Module
2026-03-27 20:48:11,283 - DEBUG - \n
2026-03-27 20:48:11,533 - DEBUG - #
...
2026-03-27 20:48:25,284 - DEBUG - import
2026-03-27 20:48:25,547 - DEBUG -  torch
2026-03-27 20:48:25,808 - DEBUG - \n
2026-03-27 20:48:26,075 - DEBUG - \n
2026-03-27 20:48:26,345 - DEBUG - #
2026-03-27 20:48:26,610 - DEBUG -  ---
2026-03-27 20:48:26,871 - DEBUG -
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 56974)
Traceback (most recent call last):
...
  packages/mlx_lm/tool_parsers/mistral.py", line 17, in parse_tool_call
    raise ValueError(f"Could not parse tool call from: {text}")
ValueError: Could not parse tool call from: write[ARGS]{"filePath": "/Volumes/Code/transformation-playground/maia-retrieval-layer/statements_retriever/doc_loader.py", "content": "# Document Loader Module\n# This module handles document loading and caching functionality\n\nfrom collections import defaultdict\nfrom dataclasses import dataclass, field\nimport hashlib\nfrom pathlib import Path\nfrom typing import Any, Callable, Dict, List, Protocol, Tuple\nimport torch\n\n# ---

```

```sh
$ pip show mlx-lm | grep "Version"
Version: 0.31.1
```


### Possible cause:
  Streaming detokenizers yield gen.text = "" when buffering tokens for punctuation cleaning or completing multi-byte UTF-8 sequences. The server's equality check if gen.text == ctx.tool_call_end: cannot distinguish between an intermediate buffered token and an actual end-of-call when the end-marker is an empty string.

### Models:
I tried out:
```sh
$ ls -l ~/.cache/huggingface/hub | grep Devstral
drwxr-xr-x@ 5 vitaly  staff  160 Mar 26 13:22 models--mlx-community--Devstral-2-123B-Instruct-2512-4bit
drwxr-xr-x@ 5 vitaly  staff  160 Mar 28 01:16 models--mlx-community--mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-BF16
drwxr-xr-x@ 5 vitaly  staff  160 Mar 28 00:31 models--mlx-community--mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-MXFP4
```

the models were called from mistral-vibe:
```toml
active_model = "devstral-local"

[[providers]]
name = "local_mlx"
api_style = "openai"
api_base = "http://127.0.0.1:8080/v1"
api_key = "dummy_key"
# backend = "generic"

[[models]]
alias = "devstral-local"
provider = "local_mlx"
# name = "mlx-community/Devstral-2-123B-Instruct-2512-4bit"
# name = "mlx-community/mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-MXFP4"
name = "mlx-community/mistralai_Devstral-Small-2-24B-Instruct-2512-MLX-BF16"
```

### Proposed Fix:

Ensure tool_call_end is not "" before performing the equality check in server.py:

```python
            elif in_tool_call:
                # if gen.text == ctx.tool_call_end:  <-- leads to parsing an incomplete tool call
                if ctx.tool_call_end and gen.text == ctx.tool_call_end:
                    tool_calls.append(tool_text)
                    tool_text = ""
                    in_tool_call = False
                else:
                    tool_text += gen.text
```


  This allows intermediate empty strings to correctly fall through to the else block, where they are appended to the tool_text buffer without terminating the call. 
 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mistral Tool Calling: Premature termination on empty tokens #1069

Issue:

Possible cause:

Models:

Proposed Fix:

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Mistral Tool Calling: Premature termination on empty tokens #1069

Description

Issue:

Possible cause:

Models:

Proposed Fix:

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions