fix: auto-scroll follows thinking streams + Codex tool display by openasocket · Pull Request #410 · RunMaestro/Maestro

openasocket · 2026-02-18T20:33:56Z

Summary

Replace filteredLogs.length auto-scroll dependency with MutationObserver to detect in-place content changes (thinking stream text growth, tool status updates)
Add programmatic scroll guard to prevent the scroll handler from falsely pausing auto-scroll during effect-driven scrollTo() calls
Fix Codex tool execution display: carry tool name from tool_call to tool_result, handle array-typed command args, add Codex-specific arg extraction, truncate large outputs

Motivation

Auto-scroll stops following thinking/tool streams (Fixes #402, supersedes #406)

The auto-scroll feature (PR #390) depends on filteredLogs.length to trigger scroll-to-bottom. However, thinking chunks update the last log entry's text in-place without changing the array length:

// useAgentListeners.ts — thinking chunk handler
logs: [...tab.logs.slice(0, -1), { ...lastLog, text: combinedText }]
//                                               ^^^^^^^^^^^^^^^^^^
//                                               text grows, length stays the same

Timeline of the bug:

T1: First thinking chunk → new log added → length 1→2 → scroll fires ✓
T2: Second thinking chunk → text updated → length still 2 → NO scroll ✗
T3: Third thinking chunk → text updated → length still 2 → NO scroll ✗
T4: Tool event → new log added → length 2→3 → scroll fires ✓
T5: More thinking → text updated → length still 3 → NO scroll ✗

PR #406 proposed a content-fingerprint memo (lastVisibleLogActivityKey) that creates a composite key from the last log's metadata. This works but couples the scroll trigger to React's data model. Our approach uses a MutationObserver on the scroll container instead, which:

Catches all content changes — new DOM nodes, text mutations — regardless of how the React data model produces them
Is framework-agnostic — doesn't need to know about filteredLogs, collapsedLogs, or log data shapes
Naturally throttles — MutationObserver batches mutations and delivers them asynchronously, providing built-in coalescing
Has zero React overhead — no additional memos, no dependency array churn, no re-renders

Programmatic scroll falsely pauses auto-scroll (secondary race condition)

When the auto-scroll effect calls scrollTo({ behavior: 'auto' }), the instant scroll fires the onScroll handler synchronously. If scrollHeight just grew from new content but the scroll position hasn't propagated yet, the handler computes atBottom = false and sets autoScrollPaused = true — pausing auto-scroll during a programmatic scroll. This creates intermittent freezes where the viewport stops following output mid-stream.

Fixed with a ref-based guard (isProgrammaticScrollRef) set before scrollTo() and cleared via queueMicrotask. The scroll handler checks this flag before pausing.

Codex tool executions show no command details

Codex bash tool entries render as just bash ● / bash ✓ with no information about what command was executed. Three sub-bugs:

Missing tool name on completion: Codex emits tool_call and tool_result as separate item.completed JSONL events. The parser transforms tool_result without a toolName field (only tool_call items include item.tool). StdoutHandler's gate (event.toolName check) silently drops the event — the renderer never receives "completed" status.

Array commands not rendered: Codex tool_call items use args: { command: ['ls', '-la'] } (string array). The renderer's safeStr(toolInput.command) returns null for non-strings, so no command detail is displayed. This bug is a new finding not addressed in PR #407's tool fix (commit 451f8871), which added lastToolName tracking and Codex arg patterns but didn't handle the array command format.

No output truncation: Large tool outputs (e.g., cat of a big file) create oversized log entries with no limit.

Changes

Commit 1: `638db6ff` — MutationObserver auto-scroll

Replace the filteredLogs.length dependency in the auto-scroll useEffect with a MutationObserver on scrollContainerRef:

const observer = new MutationObserver(() => {
    if (shouldAutoScroll()) scrollToBottom();
});
observer.observe(container, {
    childList: true,    // New/removed DOM nodes (new log entries)
    subtree: true,      // Watch all descendants
    characterData: true, // Text node mutations (thinking stream growth)
});

The observer is set up and torn down inside the effect, keyed on [session.inputMode, autoScrollAiMode, autoScrollPaused]. The filteredLogs.length dependency is completely removed from this effect.

Intentionally kept: The "detect new messages" effect and handleScrollInner still use filteredLogs.length for unread badge counting and read state tracking — in-place text updates should NOT increment unread counts.

Commit 2: `35b86947` — Programmatic scroll guard

Add isProgrammaticScrollRef (boolean ref)
Set true before scrollTo(), clear via queueMicrotask after
Check !isProgrammaticScrollRef.current in handleScrollInner before setting autoScrollPaused = true

Why queueMicrotask not setTimeout(0): microtasks run after the current task's synchronous work but before the next macro task. Since behavior: 'auto' scrolls synchronously, the onScroll event fires in the same task. The microtask clears the flag before any subsequent user scroll could fire.

Commit 3: `fdf8329f` — Tests

4 new tests in auto-scroll.test.tsx:

Thinking text growth triggers auto-scroll (same array length)
Unread badge does NOT increment on in-place text updates
Tool event appends trigger auto-scroll
Genuine user scroll-up still pauses auto-scroll (guard doesn't interfere)

Includes a class-based MockMutationObserver for JSDOM compatibility.

Commit 4: `c276056c` — Codex tool display

Parser (codex-output-parser.ts):

Add lastToolName field: stored on tool_call, carried to tool_result, cleared after use
Add MAX_TOOL_OUTPUT_LENGTH (10K chars) truncation in decodeToolOutput

Renderer (TerminalOutput.tsx):

Add safeCommand() helper: handles both string and string-array command values (joins arrays with spaces)
Add truncate() helper for length-limited display
Add Codex-specific arg patterns: path, cmd, code, content
Change safeStr(toolInput.command) → safeCommand(toolInput.command)

Tests (codex-output-parser.test.ts):

Tool name carryover from tool_call to tool_result
lastToolName reset after consumption (no stale names)
Output truncation at 10K chars with marker

Files Changed (4 files, +375 / -29)

File	Changes
`src/renderer/components/TerminalOutput.tsx`	MutationObserver effect, `isProgrammaticScrollRef` guard, `safeCommand()` helper, Codex arg extraction
`src/main/parsers/codex-output-parser.ts`	`lastToolName` tracking, output truncation
`src/__tests__/renderer/components/auto-scroll.test.tsx`	4 new tests + MutationObserver mock
`src/__tests__/main/parsers/codex-output-parser.test.ts`	3 new tests for tool name carryover + truncation

Fixes AI chat auto-scroll does not follow streaming updates when Thinking/Tool Calls are visible #402
Supersedes fix: keep AI auto-scroll synced with streaming updates #406 (closed — content-fingerprint approach replaced with MutationObserver)
Extends feat: add auto-scroll toggle for AI output #390 (original auto-scroll feature, merged)
Overlaps with Codex integration fixes + agent version detection #407 (Codex tool fixes — lastToolName + truncation exist there too, but array command handling is new here)

Test plan

npm run lint — clean
npm run test — 468 test files, 19,984 tests pass
New auto-scroll tests: thinking stream growth, unread badge isolation, tool event scroll, user scroll-up guard
New Codex parser tests: tool name carryover, stale name reset, output truncation
Manual: enable thinking display → start streaming prompt → verify viewport stays pinned during thinking text growth
Manual: scroll up during streaming → verify auto-scroll pauses → scroll back down → verify it resumes
Manual: Codex agent → run a command → verify tool entries show command details (not just "bash ●")

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Tool context is preserved across consecutive tool events so displayed tool names remain consistent.
- Terminal displays additional tool details and trims very long tool content for readability.
Bug Fixes
- Improved auto-scroll: programmatic scrolling no longer interrupts auto-scroll; user scrolls still pause it.
- Unread badge/alignment updated to reflect actual scroll-bottom state.
Tests
- Added coverage for tool name carryover, output truncation, and mutation-driven scroll behavior (includes test polyfill).

…erver The auto-scroll effect previously depended on filteredLogs.length, which missed in-place text updates during thinking/tool streaming (the array length doesn't change when thinking chunks update the last log entry). Replace with a MutationObserver on the scroll container that detects all DOM mutations — childList for new nodes, characterData for text changes, subtree to watch all descendants. This catches both new log entries and in-place thinking stream growth. Also adds a clarifying comment to the detect-new-messages effect explaining why it intentionally still uses filteredLogs.length (unread badge counts should only increment on new entries, not text updates). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ausing Adds isProgrammaticScrollRef that the MutationObserver scrollToBottom sets before calling scrollTo() and clears via queueMicrotask. The scroll handler checks this flag before pausing auto-scroll, preventing the race condition where a programmatic scroll triggers onScroll with a stale scrollTop and falsely sets autoScrollPaused=true. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…scroll guard Add thinking stream auto-scroll tests verifying: - In-place text growth triggers auto-scroll via MutationObserver - Unread badge does not increment on thinking text growth - Tool event appended after thinking triggers auto-scroll Add programmatic scroll guard test verifying: - Genuine user scroll-up still pauses auto-scroll Includes MutationObserver mock (class-based) for JSDOM compatibility. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…, handle array commands Add lastToolName tracking in Codex parser to carry tool name from tool_call to tool_result events (Codex emits these as separate item.completed events). Fix TerminalOutput tool detail extraction to handle array-typed command args (Codex uses ['ls', '-la'] format). Add Codex-specific arg patterns (path, cmd, code, content). Truncate large tool outputs to 10K chars. Fixes missing command details in Codex bash tool execution display. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-02-18T20:34:17Z

No actionable comments were generated in the recent review. 🎉

📝 Walkthrough

Walkthrough

Adds tool-name carryover and output truncation to CodexOutputParser, replaces filteredLogs-triggered auto-scroll with a MutationObserver-driven flow in TerminalOutput (with a programmatic-scroll guard and Codex command/content handling), and adds test scaffolding for MutationObserver plus parser tests for carryover and truncation.

Changes

Cohort / File(s)	Summary
Parser Implementation `src/main/parsers/codex-output-parser.ts`	Added `lastToolName` state to carry a `tool_call`'s tool name to the next `tool_result`, introduced `MAX_TOOL_OUTPUT_LENGTH = 10000`, improved `decodeToolOutput` to handle strings and byte arrays robustly, and truncate long outputs with a truncation marker and fallback/error reporting.
Parser Tests `src/__tests__/main/parsers/codex-output-parser.test.ts`	New tests validating tool name carryover from `tool_call` to `tool_result`, consumption (clearing) of carried name after use, and truncation behavior for outputs exceeding 10,000 characters versus shorter outputs.
TerminalOutput UI & Auto-scroll `src/renderer/components/TerminalOutput.tsx`	Replaced filteredLogs-length trigger with a `MutationObserver` to detect DOM mutations and invoke scroll-to-bottom when appropriate. Added `isProgrammaticScrollRef` to ignore programmatic scrolls for pause/resume logic. Added `safeCommand` and `truncate` helpers and expanded Codex tool detail fields (path, cmd, code, truncated content).
Auto-Scroll Test Scaffolding `src/__tests__/renderer/components/auto-scroll.test.tsx`	Introduced a MockMutationObserver polyfill for JSDOM tests (stores callback on `window.__mutationCallback`, provides `observe`, `disconnect`, `takeRecords`) to allow manual triggering of mutation callbacks; scaffolding duplicated across test sections.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant Terminal as TerminalOutput
    participant Observer as MutationObserver
    participant Container as Scroll Container

    User->>Container: Scroll away from bottom
    Terminal->>Terminal: Detect user scroll -> pause auto-scroll
    Note over Observer,Container: Streaming updates mutate DOM
    Observer->>Terminal: Mutation callback
    Terminal->>Terminal: shouldAutoScroll() false -> do not scroll
    User->>Terminal: Scroll to bottom or re-enable auto-scroll
    Terminal->>Terminal: Set isProgrammaticScrollRef = true
    Terminal->>Container: scrollToBottom()
    Terminal->>Terminal: Clear isProgrammaticScrollRef

sequenceDiagram
    participant ToolCall as tool_call Event
    participant Parser as CodexOutputParser
    participant Storage as lastToolName
    participant ToolResult as tool_result Event

    ToolCall->>Parser: Receive tool_call (toolName='shell')
    Parser->>Storage: Store lastToolName = 'shell'
    Parser->>Parser: Emit tool_call event

    ToolResult->>Parser: Receive tool_result (no toolName)
    Parser->>Storage: Read lastToolName ('shell')
    Parser->>Parser: Emit tool_result with toolName='shell' and status='completed'
    Parser->>Storage: Clear lastToolName

    ToolResult->>Parser: Receive another tool_result
    Parser->>Storage: Read lastToolName (null)
    Parser->>Parser: Emit tool_result with toolName=undefined

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	Title clearly summarizes the main changes: auto-scroll behavior following thinking streams and improvements to Codex tool display, matching the primary objectives of the PR.
Linked Issues check	✅ Passed	All coding requirements from issue `#402` are met: MutationObserver detects streaming updates, programmatic scroll guard prevents auto-scroll pausing, and unread-badge semantics are preserved.
Out of Scope Changes check	✅ Passed	All changes are directly related to addressing auto-scroll issues and Codex tool display improvements outlined in the PR objectives; no unrelated modifications detected.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-02-18T20:36:38Z

Greptile Summary

This PR fixes two critical auto-scroll bugs and enhances Codex tool display. The MutationObserver-based approach elegantly solves thinking stream auto-scroll failures by observing DOM mutations instead of React state, and the isProgrammaticScrollRef guard prevents race conditions during programmatic scrolls.

Key Changes:

Replaced filteredLogs.length dependency with MutationObserver to detect in-place text updates (thinking streams, tool status)
Added isProgrammaticScrollRef guard with queueMicrotask to prevent false auto-scroll pausing during effect-driven scrolls
Fixed Codex tool display: carry tool name from tool_call to tool_result, handle array-typed command args, add Codex arg patterns (path, cmd, code, content)
Added output truncation (10K chars) to prevent oversized log entries
Comprehensive test coverage: 4 new auto-scroll tests with MockMutationObserver, 3 new Codex parser tests

Architecture Highlights:

The separation of concerns is well-executed: MutationObserver handles scroll triggering, while filteredLogs.length effect handles unread badge logic (only NEW entries increment badge, not in-place updates)
The queueMicrotask timing is correct for synchronous behavior: 'auto' scrolls — flag is cleared after the same-tick onScroll event but before next user interaction
Tests validate both the fix (thinking text growth triggers scroll) and that existing behavior isn't broken (user scroll-up still pauses, unread badge doesn't increment on in-place updates)

Confidence Score: 5/5

Safe to merge — well-architected fix with comprehensive test coverage and clear separation of concerns
All 468 test files pass (19,984 tests), new tests validate both fixes and unchanged behavior, code follows project patterns (tabs, error handling, performance memos), MutationObserver is a standard browser API with proper cleanup, programmatic scroll guard uses correct microtask timing
No files require special attention

Important Files Changed

Filename	Overview
src/renderer/components/TerminalOutput.tsx	MutationObserver replaces `filteredLogs.length` dep for auto-scroll, `isProgrammaticScrollRef` guard added to prevent false pauses, `safeCommand()` handles array-typed commands, Codex-specific arg patterns added
src/main/parsers/codex-output-parser.ts	`lastToolName` state added to carry tool name from `tool_call` to `tool_result`, output truncation at 10K chars in `decodeToolOutput()`
src/tests/renderer/components/auto-scroll.test.tsx	4 new tests: thinking text growth triggers scroll, unread badge doesn't increment on in-place updates, tool events trigger scroll, user scroll-up still pauses. Includes `MockMutationObserver` for JSDOM.
src/tests/main/parsers/codex-output-parser.test.ts	3 new tests: tool name carryover from `tool_call` to `tool_result`, `lastToolName` reset after consumption, output truncation at 10K chars with marker

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[New log entry or text update] --> B{Input Mode?}
    B -->|Terminal| C[Auto-scroll enabled]
    B -->|AI| D{autoScrollAiMode?}
    D -->|No| E[No auto-scroll]
    D -->|Yes| F{autoScrollPaused?}
    F -->|Yes| E
    F -->|No| C
    
    C --> G[MutationObserver detects change]
    G --> H[Calls scrollToBottom]
    H --> I[Set isProgrammaticScrollRef = true]
    I --> J[scrollTo top: scrollHeight, behavior: auto]
    J --> K[onScroll handler fires synchronously]
    K --> L{isProgrammaticScrollRef?}
    L -->|Yes| M[Skip pause logic]
    L -->|No| N{At bottom?}
    N -->|No| O[Set autoScrollPaused = true]
    N -->|Yes| P[Clear new message badge]
    M --> Q[queueMicrotask clears flag]
    Q --> R[Ready for next interaction]
    
    S[User scrolls up] --> T[onScroll handler]
    T --> U{isProgrammaticScrollRef?}
    U -->|No genuine user scroll| V[Set autoScrollPaused = true]
    U -->|Yes| W[Ignore - programmatic]
    
    X[filteredLogs.length changes] --> Y{New entries added?}
    Y -->|Yes| Z{At bottom?}
    Z -->|No| AA[Increment unread badge]
    Z -->|Yes| AB[Update read state]
    Y -->|No in-place update| AC[No badge change]

_{Last reviewed commit: c276056}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/parsers/codex-output-parser.ts`:
- Around line 364-368: Replace the silent catch around Buffer.from(output) so it
reports the caught error to Sentry before falling back: add an import from
'../utils/sentry' and in the catch block capture the exception (e.g.,
Sentry.captureException(err)) where err is the caught error, then set decoded =
output.toString(); ensure you reference the same decoded and output variables
and the Buffer.from(...) call when locating the code to change.

In `@src/renderer/components/TerminalOutput.tsx`:
- Around line 1489-1502: The guard is cleared too early via queueMicrotask
causing a race with the throttled onScroll; update the logic around
isProgrammaticScrollRef and the throttled handler (the ref,
scrollContainerRef.current.scrollTo call, and the useThrottledCallback-based
onScroll) so the guard remains true for the throttle window: either postpone
clearing the guard by using the same throttle delay (e.g., setTimeout with the
throttle duration) after calling scrollTo, or move the guard-reset into the
throttled onScroll callback itself and check isProgrammaticScrollRef at the
start of that throttled handler before it clears the flag; adjust whichever
approach you pick to use the existing throttle duration used by
useThrottledCallback to avoid the race.

src/main/parsers/codex-output-parser.ts

src/renderer/components/TerminalOutput.tsx

…tion Address CodeRabbit review on PR RunMaestro#410: 1. codex-output-parser.ts: Report Buffer.from() decode failures to Sentry with context (operation, outputType, outputLength) instead of silently swallowing the error. 2. TerminalOutput.tsx: Fix race condition where queueMicrotask cleared isProgrammaticScrollRef before the throttled scroll handler (16ms delay) could check it. The guard flag is now consumed (checked and cleared) inside handleScrollInner itself, eliminating the timing dependency between microtasks and the throttle window. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openasocket · 2026-02-19T03:15:13Z

Addressed Review Findings — `ce7b086d`

Both actionable items from the CodeRabbit review have been fixed:

1. Sentry reporting for `Buffer.from()` decode errors

File: src/main/parsers/codex-output-parser.ts

The catch block in decodeToolOutput previously swallowed errors silently. Now imports captureException from ../utils/sentry and reports the error with context (operation, outputType, outputLength) before falling back to output.toString(). This follows the project's error handling guidelines — recoverable errors are handled gracefully but still reported to Sentry for visibility.

2. Throttle race condition on `isProgrammaticScrollRef`

File: src/renderer/components/TerminalOutput.tsx

The bug: scrollToBottom() set isProgrammaticScrollRef = true, called scrollTo(), then cleared the flag via queueMicrotask (~0ms). But handleScroll is wrapped in useThrottledCallback(fn, 16) — a leading-edge throttle that defers trailing calls via setTimeout(fn, delay - elapsed). When a scroll event fell into the throttle window, the deferred handler fired ~16ms later, after the microtask had already cleared the guard, causing a false auto-scroll pause.

The fix: Removed the queueMicrotask clear entirely. Instead, the guard flag is now consumed (checked and cleared) inside handleScrollInner itself:

} else if (autoScrollAiMode) {
    if (isProgrammaticScrollRef.current) {
        // Consume the guard — this was our own scrollTo()
        isProgrammaticScrollRef.current = false;
    } else {
        // Genuine user scroll-up
        setAutoScrollPaused(true);
    }
}

This eliminates the timing dependency between microtasks and the throttle window. The flag persists until the scroll handler actually processes it, regardless of when the throttled invocation fires.

Tests: All 65 tests pass (auto-scroll + codex parser), TypeScript compiles clean.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/parsers/codex-output-parser.ts`:
- Around line 167-171: The parser currently uses lastToolName to associate
tool_result with the preceding tool_call which breaks when events are
out-of-order; add an optional tool_call_id?: string field to the CodexItem
interface, extract the tool_call_id from incoming Codex output in parseJsonLine
(or the code that builds CodexItem), and replace lastToolName-based pairing with
an ID map (e.g., toolCallIdToName: Map<string,string>) that records tool_call id
→ tool name when handling items of type 'tool_call' and looks up/removes that
entry when handling 'tool_result'; update any places using lastToolName (and the
lastToolName field itself) to use this id-based map so tool_result maps
correctly and the existing codex-session-storage cleanup logic (which expects
tool_call_id) receives the id on CodexItem.

In `@src/renderer/components/TerminalOutput.tsx`:
- Around line 1372-1382: The guard isProgrammaticScrollRef.current can remain
true if scrollToBottom is a no-op, so update the scroll handler around
autoScrollAiMode (the throttled handler that currently checks
isProgrammaticScrollRef.current and calls setAutoScrollPaused(true)) to always
clear isProgrammaticScrollRef.current on any scroll event (not only when true)
and additionally, in the code path that performs scrollToBottom, start a short
fallback timer (>= the throttle delay, e.g. 16ms) that clears
isProgrammaticScrollRef.current after the delay to cover no-scroll cases; ensure
the timer is cancelled or reset on subsequent programmatic scrolls to avoid
races and reuse the same references (isProgrammaticScrollRef, scrollToBottom,
setAutoScrollPaused) so behavior remains consistent.

coderabbitai · 2026-02-19T03:23:22Z

src/main/parsers/codex-output-parser.ts

+	// Track tool name from tool_call to carry over to tool_result
+	// (Codex emits tool_call and tool_result as separate item.completed events,
+	// but tool_result doesn't include the tool name)
+	private lastToolName: string | null = null;
+


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the codex-output-parser.ts file to understand the context wc -l src/main/parsers/codex-output-parser.ts

Repository: RunMaestro/Maestro

Length of output: 105

🏁 Script executed:

# Read the relevant sections of the file sed -n '160,180p' src/main/parsers/codex-output-parser.ts sed -n '285,330p' src/main/parsers/codex-output-parser.ts

Repository: RunMaestro/Maestro

Length of output: 1903

🏁 Script executed:

# Search for test files or sample data related to Codex find . -type f \( -name "*codex*" -o -name "*test*" -o -name "*spec*" \) | head -20

Repository: RunMaestro/Maestro

Length of output: 853

🏁 Script executed:

# Look for tool_call/tool_result in the codebase rg "tool_call|tool_result" -t ts -t tsx -t json -C 3 | head -100

Repository: RunMaestro/Maestro

Length of output: 89

🏁 Script executed:

# Search for codex parser tests more specifically find . -type f -name "*.test.ts" -o -name "*.spec.ts" | xargs grep -l "codex\|CodexOutputParser" 2>/dev/null | head -10

Repository: RunMaestro/Maestro

Length of output: 570

🏁 Script executed:

# Search for Codex type definitions or interfaces rg "interface.*Codex|type.*Codex|tool_call|tool_result" -t ts --max-count 20

Repository: RunMaestro/Maestro

Length of output: 9060

🏁 Script executed:

# Look for any fixture or test data files find . -type f \( -name "*fixture*" -o -name "*mock*" -o -name "*.json" \) | grep -i codex

Repository: RunMaestro/Maestro

Length of output: 44

🏁 Script executed:

# Check the entire codex-output-parser.ts file structure cat -n src/main/parsers/codex-output-parser.ts | head -100

Repository: RunMaestro/Maestro

Length of output: 3967

🏁 Script executed:

# Read the test file to understand test coverage cat -n src/__tests__/main/parsers/codex-output-parser.test.ts

Repository: RunMaestro/Maestro

Length of output: 20682

🏁 Script executed:

# Check the CodexItem interface definition sed -n '1,200p' src/main/parsers/codex-output-parser.ts | grep -A 20 "interface CodexItem"

Repository: RunMaestro/Maestro

Length of output: 505

🏁 Script executed:

# Look for more context on tool_call_id usage rg "tool_call_id" -B 2 -A 2 src/main/parsers/ src/__tests__/main/parsers/

Repository: RunMaestro/Maestro

Length of output: 44

🏁 Script executed:

# Search for Codex documentation references rg "codex|Codex" src/main/parsers/codex-output-parser.ts | head -10

Repository: RunMaestro/Maestro

Length of output: 501

🏁 Script executed:

# Look at how codex-session-storage actually uses tool_call_id sed -n '1,50p' src/main/storage/codex-session-storage.ts sed -n '150,250p' src/main/storage/codex-session-storage.ts | grep -B 5 -A 5 tool_call_id

Repository: RunMaestro/Maestro

Length of output: 1740

🏁 Script executed:

# Check if there's any real Codex output sample or fixture find . -type f \( -name "*.json" -o -name "*.jsonl" -o -name "*.txt" \) -exec grep -l "item.completed.*tool_call\|tool_call.*tool_result" {} \; 2>/dev/null

Repository: RunMaestro/Maestro

Length of output: 44

🏁 Script executed:

# Verify tool_call_id is actually in CodexItem or if it's missing rg "tool_call_id" src/main/parsers/codex-output-parser.ts src/__tests__/main/parsers/codex-output-parser.test.ts

Repository: RunMaestro/Maestro

Length of output: 44

🏁 Script executed:

# Get the actual codex-session-storage code that uses tool_call_id sed -n '150,200p' src/main/storage/codex-session-storage.ts

Repository: RunMaestro/Maestro

Length of output: 1481

🏁 Script executed:

# Check the CodexItem definition more carefully for any ID field sed -n '1,150p' src/main/parsers/codex-output-parser.ts | grep -A 30 "interface CodexItem"

Repository: RunMaestro/Maestro

Length of output: 421

🏁 Script executed:

# Search for any evidence of Codex sending tool_call_id or similar ID mapping rg "tool_call_id\|item\.id.*tool" -C 3 src/main/storage/ src/main/parsers/

Repository: RunMaestro/Maestro

Length of output: 44

🏁 Script executed:

# Read more of codex-session-storage.ts to see tool_call_id context wc -l src/main/storage/codex-session-storage.ts sed -n '200,300p' src/main/storage/codex-session-storage.ts

Repository: RunMaestro/Maestro

Length of output: 3589

🏁 Script executed:

# Search for where tool_call_id is mentioned in session storage grep -n "tool_call_id" src/main/storage/codex-session-storage.ts

Repository: RunMaestro/Maestro

Length of output: 259

🏁 Script executed:

# Check the actual implementation of parseJsonLine to see if it ever extracts or stores IDs sed -n '200,350p' src/main/parsers/codex-output-parser.ts

Repository: RunMaestro/Maestro

Length of output: 3894

🏁 Script executed:

# Check what interface at line 1373 has tool_call_id sed -n '1360,1380p' src/main/storage/codex-session-storage.ts

Repository: RunMaestro/Maestro

Length of output: 548

🏁 Script executed:

# Get the full context of how tool_call_id is used in cleanup logic sed -n '1455,1475p' src/main/storage/codex-session-storage.ts

Repository: RunMaestro/Maestro

Length of output: 865

🏁 Script executed:

# Check if there are any comments in the parser about Codex output format or IDs sed -n '1,50p' src/main/parsers/codex-output-parser.ts

Repository: RunMaestro/Maestro

Length of output: 1870

🏁 Script executed:

# Search for any references to Codex CLI schema or what fields it actually outputs rg "tool_call_id|Codex.*output|Codex.*schema" -C 2 | head -50

Repository: RunMaestro/Maestro

Length of output: 3840

🏁 Script executed:

# Check if there's a reference link to Codex CLI documentation rg "github.com/openai/codex" -B 5 -A 5

Repository: RunMaestro/Maestro

Length of output: 3526

🏁 Script executed:

# Look for any TODO or FIXME related to tool_call_id or ID mapping rg "TODO|FIXME" src/main/parsers/codex-output-parser.ts src/main/storage/codex-session-storage.ts

Repository: RunMaestro/Maestro

Length of output: 44

Add tool_call_id field to CodexItem interface and use ID-based mapping for tool_call/tool_result pairing.

The current implementation relies on strict sequencing via lastToolName, which is fragile if Codex emits multiple tool_calls before tool_results or delivers them out of order. The codex-session-storage.ts cleanup logic already expects tool_call_id to be available (line 1462), but the parser doesn't extract or surface it. Add tool_call_id?: string to the CodexItem interface, extract it from Codex output, and use it for mapping tool results to their corresponding tool calls instead of relying solely on sequential tracking.

Test case showing current failure mode (lines 427–443):

it('should reset lastToolName after tool_result consumption', () => { const p = new CodexOutputParser(); // tool_call → tool_result (consumes name) → another tool_result (no name) p.parseJsonLine(JSON.stringify({ type: 'item.completed', item: { type: 'tool_call', tool: 'shell', args: {} }, })); p.parseJsonLine(JSON.stringify({ type: 'item.completed', item: { type: 'tool_result', output: 'ok' }, })); const orphan = p.parseJsonLine(JSON.stringify({ type: 'item.completed', item: { type: 'tool_result', output: 'orphan' }, })); expect(orphan?.toolName).toBeUndefined(); });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/main/parsers/codex-output-parser.ts` around lines 167 - 171, The parser currently uses lastToolName to associate tool_result with the preceding tool_call which breaks when events are out-of-order; add an optional tool_call_id?: string field to the CodexItem interface, extract the tool_call_id from incoming Codex output in parseJsonLine (or the code that builds CodexItem), and replace lastToolName-based pairing with an ID map (e.g., toolCallIdToName: Map<string,string>) that records tool_call id → tool name when handling items of type 'tool_call' and looks up/removes that entry when handling 'tool_result'; update any places using lastToolName (and the lastToolName field itself) to use this id-based map so tool_result maps correctly and the existing codex-session-storage cleanup logic (which expects tool_call_id) receives the id on CodexItem.

coderabbitai · 2026-02-19T03:23:22Z

src/renderer/components/TerminalOutput.tsx

 			} else if (autoScrollAiMode) {
-				// Pause auto-scroll when user scrolls away from bottom
-				setAutoScrollPaused(true);
+				if (isProgrammaticScrollRef.current) {
+					// This scroll event was triggered by our own scrollTo() call —
+					// consume the guard flag here inside the throttled handler to avoid
+					// the race where queueMicrotask clears the flag before a deferred
+					// throttled invocation fires (throttle delay is 16ms > microtask).
+					isProgrammaticScrollRef.current = false;
+				} else {
+					// Genuine user scroll away from bottom — pause auto-scroll
+					setAutoScrollPaused(true);
+				}


⚠️ Potential issue | 🟡 Minor

Guard can stick true when scrollToBottom doesn’t emit a scroll event.
If the scroll lands at bottom or is a no-op, isProgrammaticScrollRef stays true and the first user scroll away won’t pause auto-scroll. That can yank the user back on the next mutation. Consider clearing the guard on any scroll and add a short fallback timer after scrollToBottom (>= throttle delay) to handle no-scroll cases.

🛠️ Suggested fix

const handleScrollInner = useCallback(() => { if (!scrollContainerRef.current) return; const { scrollTop, scrollHeight, clientHeight } = scrollContainerRef.current; // Consider "at bottom" if within 50px of the bottom const atBottom = scrollHeight - scrollTop - clientHeight < 50; + const wasProgrammatic = isProgrammaticScrollRef.current; + if (wasProgrammatic) { + isProgrammaticScrollRef.current = false; + } setIsAtBottom(atBottom); @@ - } else if (autoScrollAiMode) { - if (isProgrammaticScrollRef.current) { - // This scroll event was triggered by our own scrollTo() call — - // consume the guard flag here inside the throttled handler to avoid - // the race where queueMicrotask clears the flag before a deferred - // throttled invocation fires (throttle delay is 16ms > microtask). - isProgrammaticScrollRef.current = false; - } else { - // Genuine user scroll away from bottom — pause auto-scroll - setAutoScrollPaused(true); - } - } + } else if (autoScrollAiMode) { + if (!wasProgrammatic) { + // Genuine user scroll away from bottom — pause auto-scroll + setAutoScrollPaused(true); + } + }

const scrollToBottom = () => { if (!scrollContainerRef.current) return; requestAnimationFrame(() => { if (scrollContainerRef.current) { // Set guard flag BEFORE scrollTo — the throttled scroll handler // checks this flag and consumes it (clears it) when it fires, // preventing the programmatic scroll from being misinterpreted // as a user scroll-up that should pause auto-scroll. isProgrammaticScrollRef.current = true; scrollContainerRef.current.scrollTo({ top: scrollContainerRef.current.scrollHeight, behavior: 'auto', }); + setTimeout(() => { + isProgrammaticScrollRef.current = false; + }, 32); } }); };

Also applies to: 1494-1501

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/renderer/components/TerminalOutput.tsx` around lines 1372 - 1382, The guard isProgrammaticScrollRef.current can remain true if scrollToBottom is a no-op, so update the scroll handler around autoScrollAiMode (the throttled handler that currently checks isProgrammaticScrollRef.current and calls setAutoScrollPaused(true)) to always clear isProgrammaticScrollRef.current on any scroll event (not only when true) and additionally, in the code path that performs scrollToBottom, start a short fallback timer (>= the throttle delay, e.g. 16ms) that clears isProgrammaticScrollRef.current after the delay to cover no-scroll cases; ensure the timer is cancelled or reset on subsequent programmatic scrolls to avoid races and reuse the same references (isProgrammaticScrollRef, scrollToBottom, setAutoScrollPaused) so behavior remains consistent.

When scrollTo() is called but the container is already at the bottom, the browser may not fire a scroll event. The guard flag stays true indefinitely, causing the next genuine user scroll-up to be silently consumed instead of pausing auto-scroll. Add a 32ms setTimeout fallback (2x the 16ms throttle window) that clears the flag if the throttled handler hasn't consumed it. If the handler does consume it first, the timeout harmlessly sets false again. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openasocket · 2026-02-19T05:40:59Z

Addressed Review Round 2 — `a4b74d0e`

Fixed: Stale guard on no-op scroll (Finding #4)

When scrollTo() is called but the container is already at the bottom, the browser doesn't fire a scroll event. The isProgrammaticScrollRef guard stays true indefinitely — the next genuine user scroll-up gets silently consumed instead of pausing auto-scroll.

Fix: Added a 32ms setTimeout fallback in scrollToBottom that clears the flag if the throttled handler hasn't consumed it. 32ms is 2x the 16ms throttle window, guaranteeing the handler has had its chance. If the handler does consume it first, the timeout harmlessly sets false again (idempotent).

Skipped: tool_call_id mapping (Finding #3)

The suggestion to replace lastToolName with an ID-based Map<string, string> keyed by tool_call_id is over-engineered for our use case. Codex CLI outputs sequential JSONL — tool_call always immediately precedes its tool_result in the stdout stream. Out-of-order events can't happen with CLI stdout. The existing test at lines 427-443 validates the intended behavior: an orphaned tool_result correctly gets undefined as its toolName.

Tests: All 19 auto-scroll tests pass, TypeScript compiles clean.

openasocket and others added 4 commits February 18, 2026 11:09

coderabbitai bot requested changes Feb 18, 2026

View reviewed changes

src/main/parsers/codex-output-parser.ts Show resolved Hide resolved

src/renderer/components/TerminalOutput.tsx Outdated Show resolved Hide resolved

coderabbitai bot requested changes Feb 19, 2026

View reviewed changes

pedramamini merged commit 30e8aa7 into RunMaestro:main Feb 19, 2026
1 check passed

This was referenced Feb 28, 2026

fix: complete theme token system with WCAG-compliant color tokens #370

Open

Adopt async file IO for history and stats #385

Open

[WIP] Symphony: Improved Terminal Experience (#160) #486

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: auto-scroll follows thinking streams + Codex tool display#410

fix: auto-scroll follows thinking streams + Codex tool display#410
pedramamini merged 6 commits intoRunMaestro:mainfrom
openasocket:fix/auto-scroll-improvements

openasocket commented Feb 18, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 18, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

greptile-apps bot commented Feb 18, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

openasocket commented Feb 19, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 19, 2026

Uh oh!

coderabbitai bot Feb 19, 2026

Uh oh!

openasocket commented Feb 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

openasocket commented Feb 18, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

Auto-scroll stops following thinking/tool streams (Fixes #402, supersedes #406)

Programmatic scroll falsely pauses auto-scroll (secondary race condition)

Codex tool executions show no command details

Changes

Commit 1: 638db6ff — MutationObserver auto-scroll

Commit 2: 35b86947 — Programmatic scroll guard

Commit 3: fdf8329f — Tests

Commit 4: c276056c — Codex tool display

Files Changed (4 files, +375 / -29)

Related

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

greptile-apps bot commented Feb 18, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

openasocket commented Feb 19, 2026

Addressed Review Findings — ce7b086d

1. Sentry reporting for Buffer.from() decode errors

2. Throttle race condition on isProgrammaticScrollRef

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

openasocket commented Feb 19, 2026

Addressed Review Round 2 — a4b74d0e

Fixed: Stale guard on no-op scroll (Finding #4)

Skipped: tool_call_id mapping (Finding #3)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

openasocket commented Feb 18, 2026 •

edited by coderabbitai bot

Loading

Commit 1: `638db6ff` — MutationObserver auto-scroll

Commit 2: `35b86947` — Programmatic scroll guard

Commit 3: `fdf8329f` — Tests

Commit 4: `c276056c` — Codex tool display

coderabbitai bot commented Feb 18, 2026 •

edited

Loading

Addressed Review Findings — `ce7b086d`

1. Sentry reporting for `Buffer.from()` decode errors

2. Throttle race condition on `isProgrammaticScrollRef`

Addressed Review Round 2 — `a4b74d0e`