Merged
24 commits
caafb1c
fix: PDF worker blob URL failure, sub-session metadata display, P2P t…
Apr 5, 2026
a85218c
fix: sub-session Stop button and usage badges missing in SubSessionWi…
Apr 5, 2026
7dcf7eb
fix: pinned sidebar sub-session panel missing usage/quota badges
Apr 5, 2026
18cf304
fix: sub-session Qwen quota/plan badges computed fresh instead of sta…
Apr 5, 2026
f56a0a4
fix: compact quota display — inline small text left of plan badge
Apr 5, 2026
dff10a0
fix: move plan/quota badges into shortcuts row, left of model selector
Apr 5, 2026
c7ed4d0
server: converge daemon upgrades to exact app version
Apr 5, 2026
ebf5c9f
daemon: unify watcher refresh for no-text recovery
Apr 5, 2026
7fbacd9
fix(conpty): inject env vars via spawn opts instead of POSIX export p…
Apr 5, 2026
dc6d7f4
p2p: harden parallel discussion evidence collection
Apr 5, 2026
3268361
test(watcher): add watcher refresh/retrack regression tests; fix mtim…
Apr 5, 2026
a29f9b5
fix(watcher): prevent message replay on session restart
Apr 5, 2026
df3c4b5
test(watcher): guard against full replay on session restart by default
Apr 5, 2026
37fb5d2
fix(watcher): remove emitRecentHistory from watcher startup entirely
Apr 5, 2026
9f62ed6
test: fix jsonl-watcher tests broken by emitRecentHistory removal fro…
Apr 5, 2026
010150e
test: fix macOS mtime race in jsonl-watcher-refresh test
Apr 5, 2026
bb67d18
fix(conpty): use absolute cmd.exe path and normalize CWD slashes on W…
Apr 5, 2026
f36a710
test: fix flaky tests — conpty cmd.exe assertion + p2p ENOENT race
Apr 5, 2026
3d45240
fix(test): advance runningFile mtime to fix macOS HFS+ mtime race in …
Apr 5, 2026
54125e1
Fix qwen transport queue drain timing
Apr 5, 2026
910f6e8
test: skip macos mtime-flaky p2p cleanup cases
Apr 5, 2026
9c1b42e
Fix codex retrack idle refresh
Apr 5, 2026
de96540
fix(windows): eliminate popup windows on daemon restart/upgrade/login
Apr 5, 2026
fd39689
fix(ci): update windows-daemon tests for watchdog tree-kill + remove …
Apr 5, 2026
98 changes: 98 additions & 0 deletions openspec/changes/p2p-parallel-round-summary/design.md
@@ -0,0 +1,98 @@
## Context

P2P discussions already support multi-round execution, but each hop is awaited serially. The current design uses a single shared context file that all hops append to, which is simple for a serial chain but becomes unsafe and hard to reason about once multiple hops run concurrently. The user wants the parallel version to preserve existing naming and prompt patterns, use an LLM-driven collection/synthesis step, and keep the main discussion file updated in place.

The key architectural constraint is that this is not a source-control merge system. The goal is to preserve multiple agents' viewpoints well enough that the summary step can synthesize them, not to perform perfect byte-for-byte file merging. The highest-value contract is clear hop/run state observability; minor formatting duplication is acceptable if the content remains attributable and summarizable.

## Goals / Non-Goals

**Goals:**
- Parallelize non-summary hops within each round.
- Keep the initiator kickoff and round summary as round barriers.
- Give each hop a dedicated temp file and reserve main-file writes for orchestrator collection plus summary append.
- Add explicit hop and round status contracts that can be relayed compatibly through shared types and existing consumers.
- Preserve the legacy top-level P2P progress projection through a daemon-owned compatibility layer.
- Preserve existing prompt naming and discussion heading conventions, with minimal prompt changes outside summary collection instructions.

**Non-Goals:**
- Building a perfect diff/merge engine for hop files.
- Rewriting existing P2P discussion UX beyond additive compatibility with richer run-update payloads.
- Changing the meaning of existing P2P modes or round prompt naming.
- Introducing a second long-lived persistence model for hop files beyond round-scoped temp artifacts.

## Decisions

### 1. Use per-hop temp files and keep the main discussion file single-writer during collection
Each phase-2 hop gets its own temp file for the round. That keeps concurrent writers off the main discussion file entirely. After the round barrier, the orchestrator collects the newly-added content from each hop file and appends it to the main discussion file. The summary step then reads the updated main file and appends its round summary section.
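
As a rough illustration of the per-hop temp file plus round barrier idea, the sketch below plans one temp path per hop and waits for every hop to settle before anything touches the main file. The names `planHopFiles` and `runRound`, the temp-file naming scheme, and the `dispatch` callback are hypothetical and are not taken from the actual orchestrator.

```typescript
// Hypothetical shapes for illustration only; not the real orchestrator API.
type HopId = string;

interface HopPlan {
  hopId: HopId;
  tempFile: string; // round-scoped temp file that only this hop writes to
}

type HopOutcome =
  | { hopId: HopId; status: "completed"; tempFile: string }
  | { hopId: HopId; status: "failed"; tempFile: string; reason: string };

// Derive one temp-file path per hop for a round (the naming is an assumption).
function planHopFiles(mainFile: string, round: number, hopIds: HopId[]): HopPlan[] {
  return hopIds.map((hopId) => ({
    hopId,
    tempFile: `${mainFile}.round${round}.${hopId}.tmp`,
  }));
}

// Dispatch all hops concurrently and wait until every one has settled.
// Promise.allSettled serves as the round barrier: a rejected hop does not
// stop the remaining hops from finishing.
async function runRound(
  plans: HopPlan[],
  dispatch: (plan: HopPlan) => Promise<void>,
): Promise<HopOutcome[]> {
  const settled = await Promise.allSettled(plans.map((plan) => dispatch(plan)));
  return settled.map((result, i): HopOutcome => {
    const { hopId, tempFile } = plans[i];
    return result.status === "fulfilled"
      ? { hopId, status: "completed", tempFile }
      : { hopId, status: "failed", tempFile, reason: String(result.reason) };
  });
}
```

Only after `runRound` resolves would the orchestrator, as the single writer, append collected content to the main discussion file.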

**Alternatives considered:**
- Direct concurrent writes to the main file: rejected because correctness depends on write interleaving behavior and makes attribution/debugging harder.
- Let the summary LLM perform all main-file structural writes: rejected because it weakens append-only guarantees, complicates retries, and reduces testability.

### 2. Identify hop-added content with a bounded byte-offset strategy
For each round, the orchestrator records the main discussion file's size in bytes before creating hop temp files. Each hop temp file is seeded from that main-file snapshot, so the recorded size doubles as a byte offset into every hop file. After each hop settles, the orchestrator treats the content after that offset in the hop's temp file as that hop's newly-added analysis.

The governing correctness rule is one-way:
- **missing completed-hop evidence is never acceptable**
- **minor duplication is acceptable**

If a hop file does not preserve the expected append-only structure well enough for exact byte-offset extraction, the implementation should prefer retaining attributable content over preserving perfect formatting or strict idempotency.
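
A minimal sketch of the byte-offset rule and its one-way fallback follows. It assumes the hop file and the main-file snapshot are available as byte arrays. The prefix check and the `exact` flag are illustrative assumptions, not something this design prescribes, and `extractHopAddition` is a hypothetical name.

```typescript
interface Extraction {
  text: string;
  exact: boolean; // false when the retention fallback was used
}

// True when `data` begins with every byte of `prefix`.
function startsWithBytes(data: Uint8Array, prefix: Uint8Array): boolean {
  if (data.length < prefix.length) return false;
  for (let i = 0; i < prefix.length; i++) {
    if (data[i] !== prefix[i]) return false;
  }
  return true;
}

// Extract what a hop appended after the recorded main-file offset.
function extractHopAddition(mainSnapshot: Uint8Array, hopFile: Uint8Array): Extraction {
  const decoder = new TextDecoder("utf-8");
  const offset = mainSnapshot.length; // recorded before the hop files were seeded

  if (startsWithBytes(hopFile, mainSnapshot)) {
    // Append-only structure held: everything past the offset is new.
    return { text: decoder.decode(hopFile.subarray(offset)), exact: true };
  }

  // The hop rewrote or truncated its file. Keep the whole file: duplication
  // is acceptable, missing completed-hop evidence is not.
  return { text: decoder.decode(hopFile), exact: false };
}
```

The fallback deliberately errs toward over-collection, which matches the rule that duplication is tolerable but lost evidence is not.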

**Alternatives considered:**
- Heading-based parsing: workable, but more fragile if prompt formatting drifts.
- Whole-file concatenation: rejected because it reintroduces duplicated history into each round's summary input.

### 3. Treat summary as evidence collection + synthesis, not perfect reconstruction
The summary prompt reads the round's collected evidence and appends the round summary section. The system is allowed to preserve minor duplication or formatting noise as long as each hop's viewpoint remains attributable and summarizable.

**Alternatives considered:**
- Perfect diff/merge semantics with strict idempotency: useful but too heavy for the product goal.
- Blind concatenation of whole hop files: too likely to drown the summary in duplicated history.

### 4. Add explicit hop and run state contracts before wiring broader consumers
Parallel execution makes the old serial run status insufficient. The design therefore introduces explicit hop states and summary-phase run states first, then threads them through shared types, daemon orchestration, and additive downstream relay. This is the main guardrail against an implementation that “works” but is impossible to reason about in production.

The compatibility projection remains daemon-owned: the daemon serializer is responsible for emitting the legacy top-level `status`, phase, and progress fields expected by current consumers, while newer hop/run detail remains additive.
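
One way to picture the daemon-owned projection is sketched below. The legacy field name `status` comes from the text above. The shapes of `phase` and `progress`, the `hops` field name, and both function names are assumptions made for illustration.

```typescript
type HopStatus =
  | "queued" | "dispatched" | "running"
  | "completed" | "timed_out" | "failed" | "cancelled";

interface HopEntry {
  hopId: string;
  status: HopStatus;
}

// Assumed shape of what current consumers already read.
interface LegacyRunUpdate {
  status: string;
  phase: string;
  progress: { done: number; total: number };
}

// The richer payload only adds an optional field on top of the legacy shape.
interface RunUpdate extends LegacyRunUpdate {
  hops?: HopEntry[];
}

const TERMINAL_HOP_STATES = new Set<HopStatus>([
  "completed", "timed_out", "failed", "cancelled",
]);

// Daemon-side serializer: derives the legacy projection from hop detail.
function serializeRunUpdate(status: string, phase: string, hops: HopEntry[]): RunUpdate {
  const done = hops.filter((h) => TERMINAL_HOP_STATES.has(h.status)).length;
  return { status, phase, progress: { done, total: hops.length }, hops };
}

// An older consumer that knows nothing about `hops` keeps working.
function legacyProgressLabel(update: LegacyRunUpdate): string {
  return `${update.phase}: ${update.progress.done}/${update.progress.total}`;
}
```

Because `hops` is optional and additive, the serial rollback path could emit the same legacy fields without it.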

**Alternatives considered:**
- Keep only existing run-level status fields: rejected because parallel hops would be opaque.
- Emit only best-effort textual progress: rejected because it is not stable enough for tests or downstream compatibility.

### 5. Minimize prompt churn
The kickoff and hop prompts should keep existing naming and structure. Only the summary prompt gets a new instruction block telling the summary step to consider the round's collected hop findings and append the integrated round-summary section.

**Alternatives considered:**
- Rewriting every mode prompt around parallel execution: rejected because it creates unnecessary drift and retuning cost.

### 6. Keep server/web scope additive in this change
Most execution changes belong in the daemon orchestrator and shared contracts. Server relay and existing web consumers should remain compatible with richer run-update payloads, but this change does not require a new dedicated UI capability or full browser-side hop timeline redesign.

**Alternatives considered:**
- Expanding scope to fully redesign P2P UI progress handling: rejected as out of scope for this change.
- Keeping all new fields daemon-local: rejected because server/web still need additive compatibility.

## Risks / Trade-offs

- **[A hop's new analysis is partially missed]** → Mitigate by using a deterministic byte-offset baseline per round, testing divergent multi-hop outputs, and prioritizing content retention over perfect formatting.
- **[A hop rewrites or truncates its temp file instead of pure append]** → Mitigate by treating append-only structure as a best-effort expectation, preferring attributable content retention over strict exactness, and making missing completed-hop evidence a test failure.
- **[Parallel state transitions become hard to debug]** → Mitigate by defining hop/run status contracts up front and testing event ordering explicitly.
- **[Cross-project hops accidentally regain write access to the main file]** → Mitigate by making temp-file-only writes an explicit orchestration rule and copying cross-project hop artifacts back to the main project's hop-file location instead of the main discussion file.
- **[Temp files accumulate after crashes or cleanup failures]** → Mitigate by best-effort post-summary deletion plus orphan cleanup on later orchestrator startup or run initialization.
- **[Implementation drifts by copying dispatch logic]** → Mitigate by parameterizing `dispatchHop` instead of introducing a second near-duplicate control path.
- **[Richer payloads break downstream consumers]** → Mitigate by making new run-update fields additive and testing compatibility at the relay layer.

## Migration Plan

1. Define shared hop/run status constants and additive run-update payload shape.
2. Refactor daemon orchestration so `dispatchHop` accepts per-hop file/watch parameters without duplicating logic.
3. Add per-hop temp-file lifecycle, phase-2 parallel dispatch, and cross-project hop copy-back into round hop artifacts.
4. Add orchestrator-side evidence collection into the main discussion file and summary append flow.
5. Thread expanded run payloads through existing server relay and verify existing consumers remain compatible.
6. Land unit, integration, and event-order tests before enabling the new path by default.

Rollback is straightforward: switch orchestration back to the existing serial path and ignore hop temp files. The new shared status fields should remain additive so the serial path can still populate a compatible subset.

## Resolved Decisions

- Run updates SHALL expose both a compatibility-friendly top-level projection and a stable per-hop list for debugging/observers.
- Summary prompts do not need to restate failed/timed-out hop details verbatim; only completed-hop evidence is guaranteed to be collected into the main discussion file, while failed terminal states remain observable via run updates.
28 changes: 28 additions & 0 deletions openspec/changes/p2p-parallel-round-summary/proposal.md
@@ -0,0 +1,28 @@
## Why

P2P multi-round discussion is currently fully serial, so total runtime grows with every hop in every round. That makes multi-agent audit/review discussions too slow, and it also forces all hops to write directly into one shared file, which makes round-level collection and status reporting hard to reason about.

## What Changes

- Run phase-2 hops in parallel within each round, while keeping the initiator kickoff and round summary sequential.
- Give each hop its own temporary discussion file, let the orchestrator collect each hop's newly added analysis into the main discussion file in place, and let the summary step append the round summary section.
- Standardize hop-level and run-level status updates so timeout, failure, cancel, and summary phases are observable.
- Make the daemon serializer explicitly responsible for preserving the legacy top-level P2P progress projection so richer hop/run fields remain additive for existing consumers.
- Preserve existing discussion naming and prompt structure, with only the summary prompt gaining explicit collection/synthesis instructions.
- Keep server/web scope additive: richer run-update payloads are relayed compatibly, without requiring a new P2P UI redesign in this change.

## Capabilities

### New Capabilities
- `p2p-parallel-orchestration`: Parallelize per-round hop execution with per-hop temp files, orchestrator-managed main-file collection, and summary-driven round synthesis.
- `p2p-hop-status`: Expose explicit hop and round status transitions for daemon progress tracking and additive downstream relay.

### Modified Capabilities
- `timeline-events`: Extend discussion-related run updates with additive hop-progress and summary-phase fields so downstream consumers can observe parallel execution without breaking existing payload handling.

## Impact

- **Daemon**: `src/daemon/p2p-orchestrator.ts`, prompt construction, temp-file management, timeout/cancel flow, and tests.
- **Shared**: New shared P2P status/event contract for hop- and round-level states.
- **Server**: Relay richer P2P run-update payloads compatibly.
- **Web**: Existing consumers of P2P run updates remain compatible with additive fields; full new UI behavior is out of scope for this change.
@@ -0,0 +1,70 @@
## ADDED Requirements

### Requirement: Each hop has a defined lifecycle state set
The system SHALL track each hop independently using the lifecycle states `queued`, `dispatched`, `running`, `completed`, `timed_out`, `failed`, and `cancelled`.

#### Scenario: Successful hop lifecycle
- **WHEN** a hop is accepted, dispatched, runs, and finishes normally
- **THEN** that hop transitions through defined lifecycle states ending in `completed`

#### Scenario: Timed-out hop lifecycle
- **WHEN** a hop exceeds its timeout budget
- **THEN** that hop transitions to `timed_out` and does not block other hops in the same round

#### Scenario: Failed hop lifecycle
- **WHEN** dispatch or execution fails for one hop
- **THEN** that hop transitions to `failed` while other hops in the same round continue toward the barrier

#### Scenario: Cancelled hop lifecycle
- **WHEN** the overall run is cancelled before a hop has completed
- **THEN** that hop transitions to `cancelled`
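
The lifecycle states above could be encoded as a small guard, sketched here. The state names come from the requirement. The exact transition rules (a strict `queued -> dispatched -> running -> completed` happy path, and any non-terminal state being allowed to end in `timed_out`, `failed`, or `cancelled`) are assumptions, since the scenarios do not enumerate every edge.

```typescript
const HOP_STATES = [
  "queued", "dispatched", "running",
  "completed", "timed_out", "failed", "cancelled",
] as const;

type HopState = (typeof HOP_STATES)[number];

// Assumed happy-path ordering for a successful hop.
const HAPPY_PATH: Partial<Record<HopState, HopState>> = {
  queued: "dispatched",
  dispatched: "running",
  running: "completed",
};

function isTerminal(state: HopState): boolean {
  return (
    state === "completed" ||
    state === "timed_out" ||
    state === "failed" ||
    state === "cancelled"
  );
}

// Returns true when moving a hop from `from` to `to` is allowed under the
// assumed rules: terminal states are final, abnormal endings are reachable
// from any non-terminal state, and otherwise only the happy path is legal.
function canTransition(from: HopState, to: HopState): boolean {
  if (isTerminal(from)) return false;
  if (to === "timed_out" || to === "failed" || to === "cancelled") return true;
  return HAPPY_PATH[from] === to;
}
```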

### Requirement: Run-level state distinguishes round execution from summary execution
The system SHALL expose run-level states that distinguish round execution from summary execution. At minimum, the run SHALL represent preparing, round execution, summarizing, completed, failed, and cancelled outcomes.

`preparing` SHALL mean the run has been created and is performing run-start or round-start setup before the first dispatch of that execution window. It SHALL exit when the initiator kickoff begins for round 1, or when a later round begins dispatch preparation if the implementation chooses to expose per-round preparation.

#### Scenario: Entering summary phase
- **WHEN** all hops in a round have reached terminal hop states
- **THEN** the run enters a summary-specific state before the summary step starts appending the round-summary section

#### Scenario: Summary completion advances run
- **WHEN** the summary step finishes for a non-final round
- **THEN** the run transitions back into round execution for the next round instead of directly completing

#### Scenario: Run-level transitions stay within the defined state machine
- **WHEN** a run changes top-level state
- **THEN** it only transitions along legal paths: `preparing -> round execution`, `round execution -> summarizing`, `round execution -> failed`, `round execution -> cancelled`, `summarizing -> round execution`, `summarizing -> completed`, `summarizing -> failed`, or `summarizing -> cancelled`
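
The legal paths listed in this scenario map directly onto a transition table. In the sketch below the identifier `round_execution` and the function name `transitionRun` are hypothetical. Only the edges themselves come from the scenario.

```typescript
type RunState =
  | "preparing"
  | "round_execution"
  | "summarizing"
  | "completed"
  | "failed"
  | "cancelled";

// One entry per source state, listing the states it may move to.
const LEGAL_RUN_TRANSITIONS: Record<RunState, readonly RunState[]> = {
  preparing: ["round_execution"],
  round_execution: ["summarizing", "failed", "cancelled"],
  summarizing: ["round_execution", "completed", "failed", "cancelled"],
  completed: [],
  failed: [],
  cancelled: [],
};

// Returns the new state, or throws when the move is not in the table.
function transitionRun(from: RunState, to: RunState): RunState {
  if (!LEGAL_RUN_TRANSITIONS[from].includes(to)) {
    throw new Error(`illegal run transition: ${from} -> ${to}`);
  }
  return to;
}
```

A table like this also gives event-order tests a single place to assert against.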

### Requirement: Hop terminal states have defined summary semantics
Only `completed` hops SHALL contribute collected evidence to the main discussion file. `timed_out`, `failed`, and `cancelled` hops SHALL remain observable in run updates but SHALL NOT be treated as successful evidence sources for that round.

#### Scenario: Partial failure still permits summary
- **WHEN** one hop fails or times out but other hops in the round complete
- **THEN** the run update reflects the non-completed hop terminal state and the summary phase may still start using only the completed hop evidence

#### Scenario: Zero completed hops still has defined behavior
- **WHEN** every hop in a round reaches `timed_out`, `failed`, or `cancelled` and zero hops complete
- **THEN** the run still enters the summary phase for that round, the main discussion file receives no completed-hop evidence for that round, and the summary step appends a summary section based on the empty-evidence outcome instead of silently skipping the round
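
A sketch of the collection rule is shown below: only `completed` hops contribute evidence, and a round with zero completed hops still yields a defined, explicitly empty result for the summary step. `collectRoundEvidence` and the field names are hypothetical.

```typescript
type TerminalHopState = "completed" | "timed_out" | "failed" | "cancelled";

interface HopResult {
  hopId: string;
  status: TerminalHopState;
  addedText: string; // content extracted from the hop's temp file
}

interface RoundEvidence {
  entries: { hopId: string; text: string }[]; // attributable per hop
  empty: boolean; // true when no hop completed in this round
}

// Keep evidence from completed hops only; other terminal states stay
// observable through run updates but are not evidence sources.
function collectRoundEvidence(results: HopResult[]): RoundEvidence {
  const entries = results
    .filter((r) => r.status === "completed")
    .map((r) => ({ hopId: r.hopId, text: r.addedText }));
  return { entries, empty: entries.length === 0 };
}
```

The `empty` flag lets the summary step append a section for the empty-evidence outcome rather than skipping the round.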

### Requirement: Hop and run updates are relayed compatibly to observers
The daemon SHALL emit run updates that include hop-level status progress and summary-phase transitions as additive fields, and downstream relay behavior SHALL preserve compatibility for consumers that do not understand the new fields. The daemon serializer SHALL own this compatibility projection.

#### Scenario: Browser receives hop progress
- **WHEN** a hop transitions from running to completed
- **THEN** connected observers receive a run update that reflects that hop's new terminal state

#### Scenario: Additive compatibility for older consumers
- **WHEN** a downstream consumer reads a richer run-update payload but ignores hop-level fields
- **THEN** the existing run-update handling still succeeds without requiring new mandatory fields

#### Scenario: Legacy skipped compatibility is preserved
- **WHEN** a hop ends in any non-completed terminal state
- **THEN** the richer hop-level payload records the specific terminal state, and any legacy skip-oriented compatibility field remains an aggregate backward-compatible projection rather than a replacement for the detailed hop state

### Requirement: Cancellation preserves completed hop outcomes
The system SHALL preserve completed hop outcomes even when the overall run is cancelled.

#### Scenario: Cancel during phase-2 execution
- **WHEN** the user cancels a run while some hops have already completed and others are still running
- **THEN** completed hops remain marked completed, unfinished hops transition to cancelled, and no new summary phase starts
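
The cancellation scenario could be applied to a hop list as sketched below. Leaving hops that are already `timed_out` or `failed` untouched is an assumption, as the scenario only speaks about completed and unfinished hops. `cancelRun` and the `startSummary` flag are hypothetical names.

```typescript
type HopStatus =
  | "queued" | "dispatched" | "running"
  | "completed" | "timed_out" | "failed" | "cancelled";

interface HopEntry {
  hopId: string;
  status: HopStatus;
}

const FINISHED = new Set<HopStatus>(["completed", "timed_out", "failed", "cancelled"]);

// Cancel a run: hops already in a terminal state keep it, every unfinished
// hop becomes `cancelled`, and no summary phase is started afterwards.
function cancelRun(hops: HopEntry[]): { hops: HopEntry[]; startSummary: boolean } {
  const next = hops.map((hop): HopEntry =>
    FINISHED.has(hop.status) ? hop : { ...hop, status: "cancelled" },
  );
  return { hops: next, startSummary: false };
}
```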