Skip to content

fix: ensure final assistant answer renders immediately in existing threads#2

Open
thatdaveguy1 wants to merge 1 commit intomainfrom
fix/final-answer-render-race
Open

fix: ensure final assistant answer renders immediately in existing threads#2
thatdaveguy1 wants to merge 1 commit intomainfrom
fix/final-answer-render-race

Conversation

@thatdaveguy1
Copy link
Copy Markdown
Owner

Summary

This PR fixes a real race observed in existing threads where a turn can complete before the final assistant text appears in the UI.

What changed

  • Server provider fix (codex-app-server-agent.ts)
    • Flush buffered assistant message text before emitting turn_completed for successful turns.
  • Client stream merge fix (stream.ts)
    • On completion flush, replace stale same-ID tail items with finalized head items instead of skipping by ID-only dedupe.

Tests

  • Added provider regression test for early task_complete with buffered assistant delta.
  • Added stream regression test to verify stale tail content is replaced by finalized content on completion.

Why this matters

Together these fixes prevent the "turn completed but final answer missing/partial" behavior in existing threads.

Verification

  • Focused regression tests pass in the source workspace where the fixes were developed.
  • Live paired-thread verification confirmed immediate final-answer rendering behavior with GPT-5 mini constrained test turns.

Ensure Codex provider flushes buffered assistant deltas before turn completion and make stream completion merge replace stale same-id tail items with finalized head content. Includes focused regressions for both races.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 7, 2026 05:01
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to eliminate a race where a turn can complete before the final assistant text is rendered, by flushing buffered assistant deltas before completion on the server and by improving client-side head→tail stream merging to replace stale same-ID items.

Changes:

  • Server: flush buffered assistant message text before emitting turn_completed for successful Codex app-server turns.
  • Client: on completion flush, replace existing tail items with finalized head items when IDs match (instead of ID-only dedupe).
  • Adds/updates Codex provider behaviors (plan/question mapping, feature handling, diagnostics, Windows spawn quoting) and regression tests around the race and stream merge behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
packages/server/src/server/agent/providers/codex-app-server-agent.ts Flush logic for buffered assistant deltas + significant additional provider behavior changes.
packages/server/src/server/agent/providers/codex-app-server-agent.test.ts Adds regression tests for buffered flush and new plan/question mappings.
packages/app/src/types/stream.ts Updates head→tail flush logic to replace stale tail items on matching IDs.
packages/app/src/types/stream-event.test.ts Adds regression test ensuring stale tail content is replaced on completion.
Comments suppressed due to low confidence (1)

packages/server/src/server/agent/providers/codex-app-server-agent.ts:88

  • CODEX_MODES no longer includes the "read-only" mode, but this provider still supports/validates it via MODE_PRESETS + validateCodexMode, and the server mode mappings/manifests elsewhere reference "read-only" as a valid/default mode. As-is, callers can set/read "read-only" but getAvailableModes() will never advertise it. Suggest adding the "read-only" entry back into CODEX_MODES (or fully removing support for it across presets/mappings/manifests and updating defaults accordingly).
const CODEX_MODES: AgentMode[] = [
  {
    id: "auto",
    label: "Default Permissions",
    description: "Edit files and run commands with Codex's default approval flow.",
  },
  {
    id: "full-access",
    label: "Full Access",
    description: "Edit files, run commands, and access the network without additional prompts.",
  },
];

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +48 to +52
import {
findExecutable,
quoteWindowsArgument,
quoteWindowsCommand,
} from "../../../utils/executable.js";
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new import from "../../../utils/executable.js" appears to be invalid: there is no corresponding module in the repo, and quoteWindowsCommand / quoteWindowsArgument are not defined anywhere (search only finds them in this file). This will fail build/runtime resolution. Suggest importing findExecutable from "../provider-launch-config.js" (where it is exported today) and either add the missing quote helper exports (in an existing module) or revert the quoting change until those helpers exist.

Copilot uses AI. Check for mistakes.
Comment on lines 2658 to 2662
if (event.item.type === "assistant_message") {
finalText = event.item.text;
} else if (event.item.type === "tool_call" && event.item.detail.type === "plan") {
finalText = event.item.detail.text;
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run() now overwrites finalText when a plan tool call is emitted. finalText is consumed by higher-level orchestrators as the assistant's final response; using plan text here can cause callers to receive a plan instead of the assistant message (especially since plan updates can occur mid-turn after assistant output). Suggest only updating finalText from assistant_message items, or only using plan text as a fallback when no assistant text was produced by the time the turn completes.

Copilot uses AI. Check for mistakes.
Comment on lines +2447 to 2453
private resolveCollaborationMode(): {
mode: string;
settings: Record<string, unknown>;
name: string;
} | null {
const match = this.findCollaborationMode(this.planModeEnabled ? "plan" : "code");
if (!match) return null;
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collaboration mode resolution no longer considers currentMode (auto vs full-access vs read-only), and setMode() no longer refreshes resolvedCollaborationMode. That means switching modes won't affect the collaborationMode settings sent to Codex, even if the app-server exposes distinct collaboration modes for those permissions. Suggest either (1) reintroducing mode-aware selection (e.g., treat full-access differently from auto), and call refreshResolvedCollaborationMode() from setMode(), or (2) removing/clarifying currentMode's relationship to collaboration modes so the behavior is intentional and consistent.

Copilot uses AI. Check for mistakes.
Comment on lines +3976 to +3984
async getDiagnostic(): Promise<{ diagnostic: string }> {
try {
const available = await this.isAvailable();
const resolvedBinary = findExecutable("codex");
const entries: Array<{ label: string; value: string }> = [
{
label: "Binary",
value: resolvedBinary ?? "not found",
},
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description focuses on a turn-completion race + client stream merge, but this diff also introduces substantial additional behavior (plan items re-mapped into tool calls, question permission handling changes, new feature flags, Windows spawn quoting, and a new getDiagnostic() API). Suggest updating the PR description to reflect these changes or splitting into separate PRs so the race fix can be reviewed/landed independently.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants