feat(chat): MCP tool call loop in built-in chat UI #377
rayone wants to merge 4 commits into jundot:main
Conversation
The chat UI's streamResponse function did not handle finish_reason: tool_calls — when a model returned a tool call the response was silently dropped, leaving the user with no output.

Changes:
- Accumulate delta.tool_calls chunks during streaming
- On finish_reason tool_calls: push hidden assistant message, execute all tools in parallel via /v1/mcp/execute, push hidden tool result messages, then recurse into streamResponse so the model can stream its final answer with full tool context
- Show a per-tool wrench indicator while execution is in flight
- Hide bookkeeping assistant/tool messages from the UI (msg._ui flag)
- Simplify messagesForApi to a single filter+map pass
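The bookkeeping messages in the second bullet can be sketched as follows. This is illustrative, not the PR's exact code: the message field names follow the OpenAI tool-calling format, and `toolLoopMessages` is a hypothetical helper; only the `_ui` flag and the overall shape come from the commit message.

```javascript
// Sketch: build the hidden assistant message carrying the tool_calls,
// plus one hidden tool-result message per call, so the model sees full
// tool context on the recursive streamResponse turn.
function toolLoopMessages(toolCalls, results) {
  const msgs = [
    // The model's own tool request, kept in history but hidden from the UI.
    { role: 'assistant', content: '', tool_calls: toolCalls, _ui: false },
  ];
  for (let i = 0; i < toolCalls.length; i++) {
    msgs.push({
      role: 'tool',
      tool_call_id: toolCalls[i].id,        // links each result to its call
      content: JSON.stringify(results[i]),  // tool output as a string
      _ui: false,                           // hidden bookkeeping message
    });
  }
  return msgs;
}
```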
Note re: #286: This PR and #286 are orthogonal; both are needed. If #286 merges first, it would also allow further simplification here.
…lation

Tests the two core logic contracts:
- messagesForApi filtering: excludes tool_call indicators, preserves tool_calls/tool_call_id fields, leaves normal conversation unchanged
- Streaming tool_call chunk accumulation: single call, parallel calls, no calls, missing index defaults to 0
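The accumulation contract in the second bullet can be sketched like this. The function name is hypothetical; the behavior shown (keyed by index, missing index defaults to 0, name and arguments concatenated across chunks) is what the commit says is tested, and the chunk shape follows the OpenAI streaming format.

```javascript
// Sketch: fold streamed delta.tool_calls chunks into an accumulator
// keyed by each chunk's index.
function accumulateToolCalls(acc, deltaToolCalls) {
  for (const tc of deltaToolCalls ?? []) {
    const i = tc.index ?? 0;  // missing index defaults to slot 0
    acc[i] ??= { id: '', type: 'function', function: { name: '', arguments: '' } };
    if (tc.id) acc[i].id = tc.id;
    // name and arguments arrive in fragments; concatenate them.
    if (tc.function?.name) acc[i].function.name += tc.function.name;
    if (tc.function?.arguments) acc[i].function.arguments += tc.function.arguments;
  }
  return acc;
}
```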
jundot left a comment
Thanks for this, really useful addition. The tool call loop was definitely missing from the chat UI.
Two things before merging:
1. Recursion depth limit
streamResponse() recurses on finish_reason: tool_calls with no maximum depth, so if the model keeps returning tool calls it loops forever. Adding a depth parameter with a cap (10 or so) would fix it.
2. Abort guard before recursion
If the user clicks Stop during tool execution, errors get caught but the code still recurses into a new stream. A quick if (this.abortController?.signal.aborted) return; before the recursive call would respect the user's intent.
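The two requested guards can be sketched as small predicates. These are illustrative helpers, not the PR's code; the cap of 10, the `abortController` property, and the null-safety concern come from the review and the follow-up commit.

```javascript
// Suggested cap on tool-call recursion (the review suggests "10 or so").
const MAX_TOOL_DEPTH = 10;

// Depth guard: stop the tool-call loop once the cap is reached.
function depthExceeded(depth, max = MAX_TOOL_DEPTH) {
  return depth >= max;
}

// Abort guard: bail out before recursing if the user clicked Stop.
// The controller may already be null after cleanup, hence optional chaining.
function userAborted(abortController) {
  return abortController?.signal.aborted ?? false;
}
```

In streamResponse itself, the idea is to check `userAborted(this.abortController)` after tool execution and return early instead of starting a new stream.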
Force-pushed 187e87b to dfc5b20
…ling

Addresses both review requests from jundot:
1. Recursion depth limit: add depth param to streamResponse() with MAX_TOOL_DEPTH = 10 constant. Exceeding the limit pushes an error message and fully resets streaming state (isStreaming, streamingContent, abortController) before returning — the early return is before the try block so finally does not run.
2. Abort guard before recursion: check abortController.signal.aborted after tool execution completes and bail out before the recursive call.

Additional improvements beyond the review requests:
- 30s per-tool timeout via AbortSignal.timeout() combined with the user abort signal using AbortSignal.any(), with a distinct TimeoutError message
- HTTP error detection on /v1/mcp/execute responses (non-ok status)
- Error indicators: failed tools show a red wrench + 'toolName failed' for 2s before being removed, so the user sees what went wrong
- MAX_TOOL_DEPTH and TOOL_TIMEOUT_MS moved to top-level chatApp() data properties for discoverability and easier tuning

Tests: extend TestChatToolCallSafety with 11 new cases covering depth boundary conditions, result format contracts (success/error/timeout/HTTP), error indicator format, API filtering of error indicators, and abort guard logic (including None controller safety).
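The signal combination described in the first "additional improvements" bullet can be sketched as follows. AbortSignal.timeout() and AbortSignal.any() are standard Web/Node APIs (AbortSignal.any requires Node 20.3+); TOOL_TIMEOUT_MS is the constant named in the commit, while the helper name is illustrative.

```javascript
// Per-tool timeout named in the commit (30s).
const TOOL_TIMEOUT_MS = 30_000;

// Combine the user's Stop signal with a per-tool timeout: whichever
// fires first aborts the in-flight /v1/mcp/execute fetch.
function toolAbortSignal(userSignal, timeoutMs = TOOL_TIMEOUT_MS) {
  return AbortSignal.any([userSignal, AbortSignal.timeout(timeoutMs)]);
}
```

A timeout abort carries a TimeoutError reason, which is what lets the caller show the distinct timeout message mentioned above.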
…vity

Alpine 3 only tracks properties that exist on an object at creation time. Adding _error to an indicator after push (indicator._error = true) would not trigger a re-render because Alpine never set up a dependency for it. Fix: include _error: false in the initial push so Alpine tracks the property from creation and correctly re-renders when it flips to true.
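A minimal sketch of the fixed push, assuming an indicator shape like the one below (the exact fields other than _error and _ui are illustrative, not the PR's code):

```javascript
// Push the wrench indicator with _error already present, so a framework
// that snapshots an object's keys at creation (like Alpine 3) sets up a
// dependency for it and re-renders when it later flips to true.
function pushToolIndicator(messages, toolName) {
  const indicator = {
    role: 'assistant',
    content: `${toolName}…`,  // per-tool in-flight label
    _ui: false,               // hidden bookkeeping message
    _error: false,            // declared up front: later `indicator._error = true` re-renders
  };
  messages.push(indicator);
  return indicator;
}
```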
Problem

The built-in chat UI (/admin/chat) silently drops responses when a model decides to call an MCP tool. streamResponse only handled delta.content and delta.reasoning_content — when the model returns finish_reason: tool_calls there is no delta.content, so streamingContent stays empty and nothing is pushed to the message list.

Solution

- Accumulate delta.tool_calls chunks during streaming (OpenAI chunked format, keyed by index)
- On finish_reason: tool_calls: push a hidden assistant message with tool_calls into history, execute all tools in parallel via POST /v1/mcp/execute, push hidden tool result messages, then recurse back into streamResponse so the model receives full tool context and streams a final answer
- Show a per-tool wrench indicator (toolName…) in the UI while execution is in flight
- Hide assistant/tool protocol messages from display using _ui: false / x-show="msg._ui !== false"
- Simplify messagesForApi to a single filter+map pass (was two passes with redundant filter)

Tested with

Tavily MCP + Qwen3.5-27B on macOS (Apple Silicon)
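The single filter+map pass for messagesForApi might look like the sketch below. The `_toolIndicator` flag is a hypothetical name for the indicator marker; the contract it implements (drop indicators, preserve tool_calls/tool_call_id, leave normal messages unchanged) is the one the tests describe.

```javascript
// Sketch: one filter pass drops UI-only indicators, one map pass keeps
// just the fields the API needs, preserving tool-calling fields when set.
function messagesForApi(messages) {
  return messages
    .filter(m => !m._toolIndicator)  // hypothetical indicator flag
    .map(m => ({
      role: m.role,
      content: m.content,
      ...(m.tool_calls && { tool_calls: m.tool_calls }),
      ...(m.tool_call_id && { tool_call_id: m.tool_call_id }),
    }));
}
```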