-
-
Notifications
You must be signed in to change notification settings - Fork 11.7k
[Frontend] Add streaming tool-call support to Responses API (non-Harmony) #29726
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Frontend] Add streaming tool-call support to Responses API (non-Harmony) #29726
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds support for streaming tool calls in the Responses API for non-Harmony models, which is a great enhancement. The changes introduce new logic to parse tool calls from the streaming response and emit the correct events, aligning the behavior with Harmony models.
My review focuses on the new parsing logic. I've identified a critical issue in the manual JSON parsing implementation that could lead to incorrect behavior with certain tool call arguments. I've also pointed out an area where error handling could be improved to make debugging easier.
The addition of an end-to-end test for streaming tool calls is a good step towards ensuring correctness.
| def _bracket_level(s: str, opening: str = "{", closing: str = "}") -> int: | ||
| """Calculate the current level of nested brackets in a given string.""" | ||
| level = 0 | ||
| for char in s: | ||
| if char == opening: | ||
| level += 1 | ||
| elif char == closing: | ||
| level -= 1 | ||
| return level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation of _bracket_level does not account for JSON string contents. If a string within the JSON contains the opening ({) or closing (}) characters, the bracket level count will be incorrect. This will cause _filter_delta_text to fail, and ultimately _extract_tool_call_required_streaming will produce incorrect output or fail when streaming tool calls.
For example, a tool call with an argument like {"code": "function foo() { return 1; }"} would be misparsed.
This is a critical issue as it can lead to incorrect parsing of tool calls. For robustness, the parsing logic should be made aware of JSON string boundaries. Consider using partial_json_parser more extensively to handle JSON parsing instead of manual bracket counting, as it is already a dependency and is designed to handle such cases robustly.
| try: | ||
| tool_parser = self.tool_parser(tokenizer) | ||
| except Exception: | ||
| logger.exception("Error in tool parser creation.") | ||
| tool_choice_auto = False | ||
| tool_parser = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching a broad Exception and silently disabling tool_choice_auto can hide important errors from the user. If the tool parser fails to initialize due to a misconfiguration or a bug, the request will unexpectedly fall back to not using tools, which can be difficult to debug. It would be better to log this as an error and potentially fail the request with an informative message, rather than silently changing its behavior.
Purpose
Fix for #29725,
Summary
This pull request fixes an issue where non harmony models using the Responses API with streaming and tools emit only ResponseTextDeltaEvent events, instead of ResponseFunctionCallArgumentsDeltaEvent when a tool call is selected. This prevents clients from reliably detecting and parsing tool call arguments from the stream.
Fix
This change updates the streaming path for non harmony models so that:
When the model selects a tool call, the arguments are surfaced as ResponseFunctionCallArgumentsDeltaEvent instead of plain text deltas. The event structure is now consistent with harmony models and with the non streaming Responses API behavior. With this, clients can treat harmony and non harmony models uniformly when handling tool calls during streaming.
Test Plan
Added a e2e test
Test Result:
For the example mentioned in the issue, the events emitted are as:
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.