
Python: Bug: Streaming replies error – final ResponseOutputMessage missing .delta on large runs #12296


Closed
ltwlf opened this issue May 28, 2025 · 1 comment · Fixed by #12301
Labels: agents · bug (Something isn't working) · python (Pull requests for the Python Semantic Kernel)

Comments

@ltwlf
Contributor

ltwlf commented May 28, 2025

Describe the bug
Invoking an AzureResponsesAgent that returns a function‑calling answer whose total context grows to many tokens (exact threshold still unclear) causes the run to terminate with:

AttributeError: 'ResponseOutputMessage' object has no attribute 'delta'

To Reproduce

  1. Use the latest main branch of Semantic Kernel.
  2. Configure an Azure OpenAI GPT‑4 model that supports tool/function calling (e.g. gpt‑4o).
  3. Create an AzureResponsesAgent with at least one function/tool; keep all defaults.
  4. Prompt the agent so that the response invokes the tool and the combined context becomes very large (e.g. ask for a deep recursive JSON structure).
  5. Observe the run end with the AttributeError above (a minimal repro sketch follows this list).
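
A repro sketch along the lines of these steps is below. The constructor and invocation details (`create_client`, `ai_model_id`, `plugins`, `invoke_stream(messages=...)`) follow current SK samples as best I recall and should be treated as assumptions rather than a verified API surface:

```python
import asyncio
import json

from semantic_kernel.agents import AzureResponsesAgent
from semantic_kernel.functions import kernel_function


class JsonPlugin:
    """Toy plugin whose large output is meant to push the streamed reply into the failing path."""

    @kernel_function(description="Return a deeply nested JSON structure.")
    def deep_json(self, depth: int) -> str:
        node: dict = {"leaf": "x" * 200}
        for _ in range(depth):
            node = {"child": node, "payload": "y" * 200}
        return json.dumps(node)


async def main() -> None:
    # create_client() and the constructor arguments below are assumptions based on SK samples;
    # wire up the Azure OpenAI client however your application already does.
    client = AzureResponsesAgent.create_client()
    agent = AzureResponsesAgent(
        ai_model_id="gpt-4o",
        client=client,
        name="ReproAgent",
        instructions="Always call the tool and echo its full output.",
        plugins=[JsonPlugin()],
    )
    async for chunk in agent.invoke_stream(messages="Call deep_json with depth=50 and show the result."):
        print(chunk, end="", flush=True)  # the AttributeError surfaces near the end of the stream


asyncio.run(main())
```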

Expected behavior
The agent run should finish cleanly and deliver the complete assistant message (including tool_calls) without raising an exception, regardless of response length.

Screenshots
N/A – console traceback available upon request.

Platform

  • Language: Python 3.13.3
  • Source: main branch
  • AI model: Azure OpenAI gpt‑4.1
  • IDE: VS Code
  • OS: Windows 11 

Additional context

  • The bug has surfaced in two separate apps under active development.
  • Root cause seems related to OpenAI’s switch from streaming …MessageDeltaEvent objects to a final ResponseOutputMessage (which lacks .delta) for very large tool‑calling completions—but the precise size boundary still needs investigation.
@ltwlf ltwlf added the bug Something isn't working label May 28, 2025
@markwallace-microsoft markwallace-microsoft added python Pull requests for the Python Semantic Kernel triage labels May 28, 2025
@ltwlf
Contributor Author

ltwlf commented May 28, 2025

Digging a bit deeper

Root cause – Since March 2025 the OpenAI/Azure Responses API appends a terminal object of type ResponseOutputMessage to every streamed run.

This object exposes its payload in .content and deliberately has no .delta.

By contrast, the incremental events SK was written for (ResponseTextDeltaEvent, ResponseMessageDeltaEvent, etc.) all carry their chunk under .delta.

Why we hit it mostly with tool calls – Whenever the model returns a tool_calls array, the SDK is required (per the Assistants spec) to emit the consolidated ResponseOutputMessage so you get a valid, schema-complete assistant message — see the “Final message” note in the official Assistants streaming docs. Plain-text replies may skip that step when the answer is short, which is why non-tool completions often survive.

Trigger pattern – Any long response (many tokens) or any function-calling response, because both reliably produce the final ResponseOutputMessage.

Effect in SK – In responses_agent_thread_actions.py the stream loop still assumes every event owns .delta; when the last object doesn’t, Python raises
AttributeError: 'ResponseOutputMessage' object has no attribute 'delta'.
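
The fix that eventually landed filters inside `_get_tool_calls_from_output` (see the commit and PR below); a generic guard for a stream loop of the shape described here could look like the following sketch. It is illustrative, not SK's actual code, and assumes `ResponseOutputMessage` is importable from `openai.types.responses` with text parts exposing `.text`:

```python
from openai.types.responses import ResponseOutputMessage


async def drain_stream(stream) -> str:
    """Consume a Responses event stream without assuming every event carries .delta (illustrative)."""
    text_parts: list[str] = []
    async for event in stream:
        delta = getattr(event, "delta", None)
        if delta is not None:
            # Incremental event (ResponseTextDeltaEvent, etc.): accumulate the streamed fragment.
            text_parts.append(delta)
        elif isinstance(event, ResponseOutputMessage):
            # Terminal consolidated message: the full payload lives on .content, not .delta,
            # so prefer it over whatever fragments were collected so far.
            text_parts = [part.text for part in event.content if hasattr(part, "text")]
        # Other event types (tool-call items, lifecycle events, ...) are ignored in this sketch.
    return "".join(text_parts)
```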

Impact – Hard stop: the completed assistant message (and any tool_calls) never reaches callers, breaking both user UX and planner logic.

I’m putting together a concise fix and will open a PR shortly. Just wanted to document the investigation so everyone sees it’s a general streaming issue rather than something unique to function calls.

@ltwlf ltwlf changed the title Python: Bug: Large function‑calling responses with many tokens crash with AttributeError: ResponseOutputMessage has no delta Python: Bug: Streaming replies error – final ResponseOutputMessage missing .delta on large runs May 28, 2025
ltwlf added a commit to ltwlf/semantic-kernel that referenced this issue May 28, 2025
…alls_from_output

- Updated type annotation from list[ResponseFunctionToolCall] to list[ResponseOutputItem | ResponseOutputMessage] to accurately reflect actual input types
- Improved filtering to only process ResponseFunctionToolCall objects, safely ignoring other types like ResponseOutputMessage
- Removed problematic .delta access and dangerous cast operations that caused AttributeError: 'ResponseOutputMessage' object has no attribute 'call_id'

Fixes microsoft#12296

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@moonbox3 moonbox3 added agents and removed triage labels May 28, 2025
github-merge-queue bot pushed a commit that referenced this issue May 29, 2025
…alls_from_output (#12301)

## Summary
Fixes AttributeError: 'ResponseOutputMessage' object has no attribute
'call_id' in `ResponsesAgentThreadActions._get_tool_calls_from_output`

- Updated type annotation from `list[ResponseFunctionToolCall]` to `list[ResponseOutputItem | ResponseOutputMessage]` to accurately reflect actual input types
- Improved filtering to only process `ResponseFunctionToolCall` objects,
safely ignoring other types like `ResponseOutputMessage`
- Removed problematic `.delta` access and dangerous cast operations that
caused the AttributeError
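
For reference, the filtering described in the bullets above amounts to something like this sketch; it mirrors the summary, not the literal code merged in #12301, and assumes the listed types are importable from `openai.types.responses`:

```python
from __future__ import annotations

from openai.types.responses import (
    ResponseFunctionToolCall,
    ResponseOutputItem,
    ResponseOutputMessage,
)


def get_tool_calls_from_output(
    output: list[ResponseOutputItem | ResponseOutputMessage],
) -> list[ResponseFunctionToolCall]:
    """Keep only function tool calls; ResponseOutputMessage and other item types are skipped."""
    return [item for item in output if isinstance(item, ResponseFunctionToolCall)]
```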

## Test plan
- [x] All existing OpenAI responses agent tests pass (34/34)
- [x] Ruff formatting and linting passes
- [x] Pre-commit hooks pass

Fixes #12296

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Evan Mattson <35585003+moonbox3@users.noreply.github.com>