refactor: architecture elevation with broker, modular CLI, and comprehensive hardening#8
Open
refactor: architecture elevation with broker, modular CLI, and comprehensive hardening#8
Conversation
Add ThreadIndex, RunRecord, BrokerState, SessionState, ParsedEndpoint, StructuredReviewOutput, and ThreadSetName types. Add ephemeral and serviceName to ThreadStartParams. Add reasoning to TurnResult.
…plate loading Add per-workspace state directory resolution (slug-hash scheme), model alias mapping (spark), effort validation with expanded levels (none, minimal), prompt template loading/interpolation, and new config fields (defaultBrokerIdleTimeout, maxRunsPerWorkspace, serviceName). Rename jobsListLimit to threadsListLimit with deprecated getter alias. Mark per-file paths (threadsFile, logsDir, etc.) as @deprecated — they remain functional until consuming modules are refactored. 26 tests covering all new functions.
Provides terminateProcessTree() and isProcessAlive() for killing process trees on Unix (SIGTERM then SIGKILL escalation) and Windows (taskkill). Will be used by the broker for cleanup and by the kill command for interrupt fallback.
Move the full JSON-RPC client implementation (AppServerClient, message formatting/parsing, and direct spawn connection) from protocol.ts into a new client.ts. The connect function is renamed to connectDirect to distinguish it from future broker-based connection strategies. protocol.ts becomes a thin re-export shim that maps connectDirect back to connect, preserving backward compatibility for all existing consumers.
… and spawn lock Implements the per-workspace broker lifecycle: - Endpoint abstraction (unix sockets / Windows named pipes) - Broker and session state persistence (broker.json, session.json) - Socket-based liveness probing with configurable timeout - Atomic spawn lock using O_CREAT|O_EXCL with jitter and stale-lock breaking - Teardown logic (process kill, socket cleanup, state cleanup) - Session ID resolution (env var with file fallback) - ensureConnection entry point with lock contention fallback to direct connection
- Add path traversal guard to loadTemplate (reject /, \, ..) - Narrow validateEffort return type to ReasoningEffort | undefined - Move BROKER_BUSY_RPC_CODE from types.ts to broker.ts (types.ts should remain types-only, no runtime values)
…e candidate
Two-layer model replaces the flat thread mapping:
- Thread Index: maps short IDs to ThreadIndexEntry with name, model, cwd,
timestamps. resolveThreadId now supports thr_ prefix lookup and returns
{shortId, threadId} or null instead of throwing.
- Run Ledger: per-execution RunRecord files in {stateDir}/runs/, with
create/load/update/list/prune operations and session-aware filtering.
- Resume Candidate: finds the latest completed task run, preferring the
current session, with thread name lookup from the index.
Legacy exports (loadThreadMapping, saveThreadMapping, updateThreadStatus,
legacyRegisterThread, legacyResolveThreadId, etc.) preserved for backward
compatibility with cli.ts and turns.test.ts — will be removed in Tasks 11-12.
Extract git operations (repo detection, default branch, diff stats, untracked file filtering, review target resolution) into a standalone module for use by the review command. Includes TDD test suite (22 tests).
Add emitProgress() with optional phase/threadId tracking that deduplicates consecutive same-phase emissions per thread. Add standalone inferPhaseFromLog() that maps log line content to RunPhase via regex patterns, enabling phase recovery from historical logs without phase metadata.
… extraction to turns Add robustness improvements to the turn lifecycle: - Notification buffering: queue item/completed notifications that arrive before turn/start returns the turnId, then replay them once known - Completion inference: 250ms debounce timer after agentMessage completes acts as safety net when turn/completed notification is lost - Reasoning extraction: capture reasoning from completed reasoning items, deduplicate identical sections, and populate TurnResult.reasoning - Structured capture: collect files changed and commands run from item/completed notifications alongside the dispatcher, with dedup - Opt out of item/reasoning/textDelta in initialize capabilities (use completed items for reasoning instead) Add belongsToTurn and extractReasoning as exported pure helpers.
…and adversarial prompt
Add src/reviews.ts with three public functions:
- validateNativeReviewTarget: rejects custom instructions for native review mode
- parseStructuredReviewOutput: extracts and validates JSON review output from
raw text, including markdown code fences and bare JSON
- formatReviewOutput: renders structured review results as human-readable text
Create src/prompts/adversarial-review.md with the adversarial review prompt
template using {{VAR}} interpolation and XML block structure.
23 tests covering validation, parsing edge cases, and formatting.
- EventDispatcher.reset() now clears lastPhase map for phase dedup - createRun/updateRun use atomic tmp+rename write pattern - updateRun separates read errors from write errors in catch blocks - Broker ensureConnection no longer writes fake socket endpoint; uses session-based approach with null endpoint/pid in broker state - BrokerState.endpoint type changed to string | null - isBrokerAlive accepts null endpoint and returns false - isProcessAlive treats EPERM as alive (process exists, no permission) - All empty catch blocks in broker.ts and threads.ts now log warnings with ENOENT filtering where appropriate - formatLocation shows lineStart even when lineEnd is null - getUntrackedFiles reads only first 8KB for binary detection instead of slurping entire file - Add tests: reset phase dedup, EPERM liveness, path traversal in loadTemplate, lineStart-only formatting, null endpoint probe
Split the monolithic cli.ts (~1440 lines) into focused command modules under src/commands/, leaving cli.ts as a thin router (~180 lines). Command modules created: - commands/shared.ts — Options, parseOptions, user config, withClient, model auto-selection, thread start/resume, result printing, PID mgmt - commands/run.ts — run + resume handler - commands/review.ts — review (all modes + resume) handler - commands/threads.ts — threads list, output, progress, delete, clean - commands/kill.ts — kill handler - commands/config.ts — config get/set, models, health - commands/approve.ts — approve + decline handlers cli.ts now handles signal registration, help text, command extraction, data directory setup, and dynamic dispatch to command modules. The deprecated 'jobs' command routes to threads with a deprecation warning. Added 'threads' as the canonical command name.
Wire all new infrastructure into commands and fix issues found during thorough end-to-end testing against a live Codex server and Codex code review. Wiring: - Commands use ensureConnection (broker) instead of direct connectDirect - Commands create/update RunRecords via the run ledger on every execution - Thread naming via thread/name/set for non-ephemeral task threads - Ephemeral flag: true for reviews, false for tasks - TUI handoff output: Codex session ID + resume command printed - Session ID tagging on all run records - Log offset tracking on run creation - resume-candidate CLI command wired into router - Install scripts build and install broker-server alongside CLI - SKILL.md updated with UUID examples, --discover flag, timeout guidance Integration fixes found during testing: - Completion inference fired prematurely on userMessage items — fixed to only debounce on agentMessage completion - legacyResolveThreadId now searches by full threadId value, enabling resume by full Codex session UUID - startOrResumeThread falls back to server when ID not found locally, registers thread on successful resume - resume-candidate searches run ledger + thread index + server discovery - Thread discovery queries all user-facing sourceKinds (cli, vscode, exec, appServer) with correct epoch-seconds timestamp conversion - Skip thread naming for ephemeral review threads - BrokerClient socket tests skip gracefully in sandboxed environments Codex review fixes: - Broker server now correctly handles client approval responses (P1) - resume-candidate honors --dir flag instead of hardcoding cwd (P2) - Removed early ensureDataDirs from CLI router; workspace dirs created lazily by getWorkspacePaths after --dir is parsed (P2) - Removed content-based command/file dedup that dropped repeated executions; dispatcher is now sole source for filesChanged and commandsRun (P3) Also: remove protocol.ts shim, rewrite integration tests as CLI smoke tests, update CLAUDE.md key files table, add broker-server build target, migrate state to per-workspace paths, add migration logic, update install scripts for broker-server.
Critical: - Wire resolveModel() into parseOptions and applyUserConfig for alias support - Remove undocumented --wait flag from SKILL.md - Add error logging to bare catch in broker re-check after lock - Fix approval response handler TCP buffering bug and log parse errors Important: - Wire getDefaultBranch() into review command for dynamic base branch detection - Track --base in explicit flags so review can distinguish CLI vs default - Wire migrateGlobalState() into getWorkspacePaths for automatic migration - Make resume catch selective — re-throw ambiguous prefix errors - Log silent broker fallback when spawn lock acquisition fails - Add error parameter to broker-server socket error handler - Narrow updateRun patch type to RunPatch (prevents accidental field overwrites) - Narrow updateThreadMeta patch type to ThreadMetaPatch - Narrow legacyUpdateThreadMeta meta type to exclude immutable fields - Remove dead ensureDataDirs function and update stale comments - Rewrite ensureConnection JSDoc to match actual implementation flow - Remove dead routeNotification function from broker-server - Fix SKILL.md reasoning effort levels to include none and minimal - Add broker-client.ts and broker-server.ts to CLAUDE.md Key Files table Suggestions: - Fix threads --all description from "cross-session" to "no display limit" - Add threadId to resume-candidate --json sample output - Add custom to --mode options and review modes documentation - Add pid type checking to loadBrokerState shape validation - Add basic shape validation to loadRun (runId, threadId, shortId)
The 250ms inference timer was firing after any agentMessage item completed, causing premature turn resolution before Codex finished tool calls. The reference implementation only triggers inference on agentMessage items with phase "final_answer" — the server marks the last agent message with this phase when the turn is effectively done. Also add newline separators between consecutive agentMessage items in EventDispatcher so multi-item output is readable.
EventDispatcher now tracks agentMessage items with phase "final_answer" separately from intermediate planning/status messages. In content-only mode, only the final answer is shown. In normal mode, intermediate agent messages are displayed as [codex] progress lines, keeping the --- Result --- section clean. getFinalAnswerOutput() returns only final_answer text, falling back to full accumulated output when no final_answer phase was seen (backward compat for simple tasks and reviews).
…path handling Three fixes: 1. broker-server: Track completed thread IDs so that when turn/completed arrives during the streaming request (same read chunk), the post-request code skips re-establishing stream ownership. Without this, the broker stays permanently busy after fast turns. 2. commands/threads: Use resolveWorkspaceDir(cwd) instead of raw cwd when calling thread/list for discovery, so threads are queried against the git repository root rather than an arbitrary subdirectory. 3. threads: Use path.sep instead of hardcoded "/" in migrateGlobalState path comparison for Windows compatibility.
…ct, allow read-only methods during active stream - CLI shutdown handler now sends turn/interrupt before closing the client, preventing orphaned turns when using the broker. Track the active turn ID via onTurnId callback from executeTurn. - Broker-server no longer clears activeStreamThreadIds when the stream-owning socket disconnects. The turn completes naturally and turn/completed clears the state, avoiding a permanently-busy broker. - Allow thread/read and thread/list through the broker concurrency gate during active streams so kill and threads commands work while a turn is running. - Add error logging to four bare catches: broker spawn lock re-acquire, thread file lock retry, app-server close in broker shutdown, and approval response parse failure (now includes reqId).
… elevation Security: - chmod broker socket to 0o700 after listen (multi-user protection) - Add mode:0o700 to all 16 production mkdirSync calls - Add mode:0o600 to broker/session state file writes - Validate --resume ID before using in file paths (path traversal) - Add 10MB buffer size limit on broker socket handlers (DoS) - Remove unnecessary shell:true from Windows taskkill Robustness: - Rewrite approval forwarding to use pending request map instead of dual-buffer (second data listener race condition) - Track target socket per pending approval request; only reject on matching socket disconnect - Add target socket validation on forwarded responses - Fix 60s broker approval timeout vs 1hr client timeout mismatch - Reject malformed responses missing both result and error fields - Log late/unknown forwarded response IDs - Clean up pending forwarded requests on broker shutdown - Keep broker stream exclusivity after client disconnect (prevent interleaved requests on shared app-server) - Cancel completion inference timer when non-final items arrive - Add error logging to signal handler, process termination, and resume-candidate discovery catch blocks - Handle realpathSync failure in resolveStateDir with fallback - Wrap copyFileSync in migration with try-catch - Create config directory before first config write - Move broker state persistence errors to call sites for proper handling (throw from helpers, try-catch in ensureConnection) - Gate discovery progress behind \!options.json for valid JSON output - Pass --dir option through to withClient in kill and review commands Type safety: - Export and deduplicate JSON-RPC type guards, PendingRequest, NotificationHandler, ServerRequestHandler (client.ts → broker-client.ts) - Add RpcError class to replace (err as any).rpcCode pattern - Add KnownThreadItem union and isKnownItem() guard to fix GenericItem poisoning discriminated union narrowing - Remove all 7 forced type assertions in events.ts and turns.ts - Type UserConfig fields precisely (ReasoningEffort, SandboxMode, ApprovalPolicy) and remove as any casts in applyUserConfig - Import BROKER_BUSY_RPC_CODE from broker.ts instead of redefining - Add warning log for invalid broker/session state shape - Check git exit status in getDiffStats - Log non-ENOENT errors in getUntrackedFiles
…ound task output handling - Remove Codex session ID and TUI resume command from printResult output — these were inherited from the official CC plugin and never used by the AI consumer - Remove redundant thread ID from printResult footer — already shown in the startup progress line - Remove unused shortId/threadId params from printResult signature - Consolidate duplicated background/sandbox execution instructions in SKILL.md into a single block - Fix SKILL.md telling Claude to never use TaskOutput, which prevented it from reading completed background task results — now distinguishes "while running" (don't poll) from "when notified" (read output) - Update TUI Handoff section to reflect removed output lines
Add onClose callback to AppServerClient interface — fires when the underlying connection closes unexpectedly (process exit or socket close), but not on intentional close(). The broker-server registers this callback on the app-server client to trigger immediate shutdown when the app-server dies, instead of sitting idle for up to 30 minutes rejecting every request. This ensures the next ensureConnection() spawns a fresh broker with a fresh app-server. The shutdownInitiated flag prevents double-shutdown when the intentional shutdown path (idle timeout, broker/shutdown RPC, SIGTERM/SIGINT) closes the app-server, which would otherwise trigger the onClose handler.
…lient edge cases broker-server.test.ts (29 tests, new file): - Concurrency control: busy rejection, stream ownership, turn/completed release, interrupt/read-only allowed during active stream - Approval forwarding: round-trip, malformed response, socket disconnect - Request forwarding and error forwarding to app-server - Notification routing to stream-owning socket - Socket permissions (0o700 after listen) - broker/shutdown RPC, idle timeout, activity timer reset - Multiple clients sequential access, client disconnect during stream - Buffer overflow protection, stale socket cleanup Uses real broker-server subprocess with mock codex CLI injected on PATH, lightweight TestClient over Unix socket. commands/shared.test.ts (97 tests, new file): - parseOptions: all flags (model, reasoning, sandbox, approval, timeout, limit, dir, resume, mode, ref, base, discover, all, json, content-only, unset), invalid values via subprocess, explicit set tracking, positional collection - pickBestModel: upgrade chains, -codex preference, circular guard - validateGitRef: accepts valid refs, rejects shell metacharacters - applyUserConfig: CLI beats config, explicit vs configured, invalid values warned, model alias resolution - turnOverrides: new thread vs resumed, explicit-only forwarding - formatDuration: edge cases (0ms, sub-second, minutes, hours) - isProcessAlive: missing PID file, dead/alive process, invalid PID broker.test.ts (15 tests, appended to existing): - Request timeout with pending request cleanup - Socket close/error during pending request - close() while requests pending, request after close() - Server-sent request (onRequest handler) round-trip - onClose callback: fires on unexpected close, not on intentional close, unsubscribe works - Buffer overflow protection (>10MB without newline)
pickBestModel (7 FAKE → 7 REAL):
- Export pickBestModel from shared.ts so tests call production code
- Replace subprocess copy-paste reimplementation with direct function
calls using properly typed Model objects
- Circular chain test confirmed correct behavior (returns "a" not "b")
broker-server.test.ts:
- Replace checkSocketSupport() early-return with describe.skipIf so
skipped tests are visible in test output (was 27 silent false-greens)
- Replace 2 source-code grep tests with real behavioral buffer
overflow test that sends >10MB and verifies socket destruction
- Fix 2 always-pass concurrency tests by adding gotBusy flag and
asserting it (previously both try/catch branches passed)
- Strengthen turn/interrupt assertion from toBeDefined() to toEqual({})
- Add limitation comments to tests where external verification is
impractical (notification forwarding, stderr warnings from subprocess)
broker.test.ts:
- Narrow regex assertions: socket close test expects specifically
"Broker connection closed", error test expects "Broker socket error"
(were both accepting either, making them indistinguishable)
Linux fixes (broker.test.ts): - Fix socket error test: capture rejection handler before triggering destroy to prevent bun:test unhandled rejection error - Fix close-while-pending test: same unhandled rejection pattern - Fix buffer overflow test: use drain-aware write loop with longer timeout instead of single large write that hits backpressure broker-server.test.ts: - Skip all broker-server integration tests on Windows (mock codex script uses Unix shebang, tests use unix: socket endpoints) - Fix buffer overflow test: use raw socket with drain-aware write loop instead of TestClient, handles kernel backpressure properly Windows fixes (broker.test.ts): - Skip createEndpoint Unix-format assertions on Windows - Skip all BrokerClient socket tests on Windows (unix: endpoints) Windows fixes (config.test.ts): - Normalize path comparisons with path.resolve() for forward/back slash differences (git returns forward slashes on Windows) - Use platform-appropriate temp directories instead of hardcoded /tmp - Use path.sep in resolveStateDir assertion Windows fixes (commands/shared.test.ts): - Use platform-conditional paths in --dir tests - Set USERPROFILE alongside HOME in subprocess env for applyUserConfig tests (Windows os.homedir() reads USERPROFILE, not HOME)
- Pass options.dir to withClient in run command — without this, run -d /other/repo uses the wrong workspace's broker, breaking isolation and causing spurious busy errors - Arm completion inference resolver before replaying buffered items — if a fast turn delivers final_answer before turn/start resolves, the buffered replay now triggers the debounce timer instead of waiting for the full timeout - Keep stream lock on socket error (not just close) — apply same sentinel pattern as the close handler to prevent a second client from interleaving on the shared app-server
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86b3bc1deb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Broker correctness: - Clear completedStreamThreadIds on normal turn completion to unblock second turn on the same thread - Interrupt orphaned turns when requesting client disconnects mid-request - Kill leaked broker process when readiness check times out - Save pid: null (not stale pid) after killing timed-out broker Session management: - Compute sessionStartedAt alongside sessionId to prevent perpetual session churn when existing session expires Run ledger: - Map interrupted status to cancelled (not failed) in run records - Update run record on SIGINT/SIGTERM via new activeRunId tracking - Call pruneRuns after createRun to enforce maxRunsPerWorkspace cap CLI: - Handle --help after command name (was treated as prompt text) Turn lifecycle: - Arm completion inference for exitedReviewMode (reviews, not just agentMessage with final_answer) - Cancel inference timer on item/started to prevent premature completion synthesis when new work follows final_answer - Append multiple final_answer messages instead of overwriting
The broker enforces single-stream concurrency, which prevented parallel turn execution in the same workspace. When a second run/review is attempted while the broker is serving another client's turn, the CLI now spawns a standalone app-server via connectDirect instead of failing with -32001. Implementation: - Broker initialize response includes busy flag (true when a streaming turn is active) - connectToBroker exposes brokerBusy on the returned client - ensureConnection accepts a streaming parameter; when true and broker is busy, falls back to direct connection without tearing down broker - Non-streaming commands (kill, threads, etc.) still connect to the broker so they can inspect/interrupt the active turn - run and review pass streaming=true via withClient
…elevation - Replace deprecated `jobs` with `threads` in CLI command tables - Add `config` command to CLI tables - Update CLAUDE.md architecture notes for per-workspace state dirs, broker lifecycle, and run ledger - Update CONTRIBUTING.md architecture table with new modules (broker, commands/, process, git, reviews) and rename protocol.ts → client.ts
Replace copied adversarial-review prompt with an original template system that supports user-customizable templates with YAML frontmatter. Template system: - Frontmatter with name, description, and sandbox fields - parseTemplateFrontmatter strips frontmatter, returns metadata + body - loadTemplate returns body only; loadTemplateWithMeta returns both - listTemplates scans user (~/.codex-collab/templates/) and built-in dirs - CRLF line endings normalized for cross-platform compatibility - Template sandbox validated against allowed modes CLI integration: - --template <name> flag on run command wraps prompt in template - Template sandbox applied as default (marked explicit for resume) - codex-collab templates command lists available templates - Template ID always uses filename (not frontmatter name field) Install scripts: - Both install.sh and install.ps1 generate SKILL.md with injected template table from <\!-- TEMPLATES --> placeholder - Built-in prompts copied to scripts/prompts/ for production builds - bun run build also copies prompts directory Built-in template: - plan-review: reviews implementation plans against the codebase (read-only sandbox, structured output with file-specific findings)
Critical fixes: - Remove thr_ prefix guard from resolveThreadId so UUID-style thread IDs resolve correctly (threads.ts) - Extract broker-server data handler into processMessage with per-socket Promise queue to prevent async reentrancy on shared buffer; route approval responses synchronously to avoid deadlock (broker-server.ts) - Log session state save failures instead of swallowing them (broker.ts) Error handling improvements: - Log failed orphan-turn interrupts instead of suppressing (broker-server.ts) - Log close handler errors in client and broker-client - Check taskkill exit code on Windows (process.ts) Cleanup: - Remove dead clearSocketOwnership function (broker-server.ts) - Hoist isKnownItem Set to module scope to avoid per-call allocation (types.ts) - Fix stale --instructions error message in git.ts, mark duplicate resolveReviewTarget as deprecated
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Major internal refactor that restructures codex-collab from a monolithic CLI into a modular architecture with connection pooling, per-workspace state, and structured review support — while keeping the user-facing interface unchanged.
Migration: Re-run
install.sh. Thread state migrates automatically on first command. Thejobscommand is preserved as a deprecated alias forthreads.Architecture Changes
Modular CLI
cli.tsmonolith into a thin router (55 lines) with dynamic imports for each command (commands/run.ts,commands/review.ts,commands/threads.ts,commands/kill.ts,commands/config.ts,commands/approve.ts)commands/shared.ts: argument parsing, user config, client lifecycle, model auto-selection, thread start/resume orchestrationBroker (connection pooling)
broker-server.ts: Detached process that multiplexes multiple CLI invocations through a singlecodex app-serverinstance, reducing startup latency for sequential commandsbroker-client.ts: Socket-based client implementing the sameAppServerClientinterface as direct connectionsbroker.ts: Lifecycle management — spawn, liveness probe, fallback-to-direct, state persistence, advisory file lockingbroker/shutdownRPCProtocol layer
protocol.ts→client.tswith exported JSON-RPC type guards and shared typesAppServerClientinterface withonClosecallback for connection lifecycle eventsRpcErrorclass replacing(err as any).rpcCodepatternState management
~/.codex-collab/workspaces/{slug}-{hash}/) with automatic migration from global layoutresume-candidate --jsonfor skill-driven thread reuseNew modules
reviews.ts: Structured review output parsing, validation, adversarial review prompt templategit.ts: Diff stats, untracked file detection, review target resolutionprocess.ts: Platform-aware process tree termination (Unix SIGTERM→SIGKILL, Windows taskkill)events.ts: Enhanced dispatcher with phase dedup, final answer separation, log-based inferenceturns.ts: Notification buffering, completion inference (debounced), reasoning extractionSecurity Hardening
mkdirSynccalls usemode: 0o700(16 production calls audited)chmodto0o700after listenmode: 0o600--resumeID validated before use in file paths (path traversal prevention)shell: truefrom WindowstaskkillRobustness
onClosecallback) instead of sitting idle for 30 minutesresultanderror) properly rejected instead of leaving promise unsettledrealpathSyncfallback inresolveStateDirfor non-existent pathsthreads --discover --jsonno longer emits progress lines that break JSON outputfinal_answerType Safety
KnownThreadItemunion +isKnownItem()guard fixesGenericItempoisoning discriminated union narrowing — removed all 7 forcedascasts in events.ts and turns.tsPendingRequest, handler types betweenclient.tsandbroker-client.tsUserConfigfields typed precisely (ReasoningEffort,SandboxMode,ApprovalPolicy) — removedas anycastsBROKER_BUSY_RPC_CODEimported instead of redefinedToken Efficiency
Test Coverage
390 pass, 32 skip (socket-dependent, properly marked), 0 fail across 14 test files.
New test files:
broker-server.test.ts(29 tests): Concurrency control, approval forwarding, notification routing, socket permissions, idle timeout, app-server exit, buffer overflow — uses real subprocess broker with mock codex CLIcommands/shared.test.ts(97 tests):parseOptions(44),pickBestModel(7),validateGitRef(11),applyUserConfig(10),turnOverrides(5),formatDuration(8),isProcessAlive(6)broker.test.ts(+15 tests): Request timeout, socket close/error during request,onRequestround-trip,onClosecallback, buffer overflowAll tests audited for quality — fake tests (source-code greps, copy-paste reimplementations) replaced with real behavioral tests. Silent-skip pattern replaced with
describe.skipIffor visible skip reporting.Test plan
install.shandinstall.sh --devboth succeed on a clean checkoutcodex-collab healthafter installcodex-collab run "what does this project do?" --content-onlyand verify outputcodex-collab review --mode uncommitted --content-onlyand verify structured outputcodex-collab threadsshows the created threadscodex-collab kill <id>works on a running thread--resumewith both short ID and full thread ID formats~/.codex-collab/workspaces/*/broker.json)~/.codex-collab/doesn't exist (config directory creation)main, switch to this branch, verifythreadsstill lists them